INDEX
Explanations
references to war-themed titles or topics in film
Negative Logits
_MISC      -0.16
ох         -0.15
ỳ          -0.15
Skywalker  -0.14
mbH        -0.14
ucha       -0.14
ackbar     -0.14
ettes      -0.14
низ        -0.14
NonNull    -0.14
Positive Logits
625        0.17
oner       0.16
Gen        0.15
arel       0.15
857        0.14
925        0.14
ycz        0.14
iden       0.14
Agency     0.14
ÂŃi        0.14
Activations Density 0.074%