Explanations
phrases related to physical acts of aggression or force
references to physical violence and conflict-related actions
Negative Logits
interstitial    -0.81
culosis         -0.70
folio           -0.68
occupied        -0.65
worm            -0.65
ears            -0.64
bee             -0.64
JO              -0.62
artist          -0.62
ASAP            -0.62
Positive Logits
punches             1.30
deliberations       0.98
arnaev              0.81
;;;;;;;;            0.71
ãĥĥãĥī              0.70
=-=-=-=-=-=-=-=-    0.70
ogun                0.68
sidx                0.68
Rus                 0.67
åij                 0.67
Activations Density 0.001%