INDEX
Explanations
references to violence and its various contexts
Negative Logits
leanup    -0.16
CAST      -0.16
alle      -0.16
roud      -0.15
/stdc     -0.15
owitz     -0.15
elian     -0.15
Ñıг       -0.14
689       -0.14
chg       -0.14
Positive Logits
vens      0.20
-force    0.18
force     0.17
essel     0.16
/angular  0.15
force     0.15
adier     0.15
Force     0.14
339       0.14
olson     0.14
Activations Density 0.019%