INDEX
Explanations
references to action and adventure themes in films
Negative Logits
ALSE         -0.17
ontent       -0.16
Telescope    -0.15
eteor        -0.15
dream        -0.15
ds           -0.14
aight        -0.14
Miner        -0.14
edith        -0.14
Uvs          -0.14
Positive Logits
Vig           0.17
Femme         0.17
kill          0.17
dispatch      0.16
hack          0.16
carn          0.16
sovereign     0.16
ticking       0.16
Agent         0.16
hired         0.15
Activations Density 0.183%
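
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions where the feature's activation is above zero, could look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard itself does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 2 of which activate.
toy_activations = [0.0] * 998 + [1.3, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.200%"
```

Under that assumption, a density of 0.183% would mean the feature fires on roughly 18 of every 10,000 sampled tokens.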