INDEX
Explanations
words related to scenes or settings in a story
references to specific scenes in media
Negative Logits
orem        -0.85
thood       -0.80
achev       -0.80
rition      -0.78
dp          -0.77
berman      -0.75
reditary    -0.75
heit        -0.72
ikarp       -0.72
gements     -0.72
Positive Logits
Dialogue    1.02
scenes      0.95
Capture     0.94
unfolding   0.92
Scenes      0.92
involving   0.87
depicting   0.80
Scene       0.79
scene       0.75
depicted    0.75
Activations Density: 0.036%