INDEX
Explanations
film titles, genres, and elements related to American action films and B-movies
Negative Logits
ane     -0.16
――――    -0.15
ASS     -0.15
osal    -0.15
Bast    -0.14
amel    -0.14
resi    -0.14
oren    -0.14
ufe     -0.14
NECT    -0.14
Positive Logits
ropa    0.17
Rope    0.16
qua     0.16
ropes   0.16
boots   0.15
rope    0.15
Gloss   0.15
isco    0.14
OUNDS   0.14
eof     0.14
Activations Density 0.049%