Explanations
terms related to violence and conflict
Negative Logits
gub      -0.49
Lazy     -0.47
lazy     -0.47
sou      -0.46
懒       -0.42
elm      -0.41
Lazy     -0.41
cre      -0.41
small    -0.40
top      -0.40
Positive Logits
attack       1.90
violence     1.84
attacks      1.79
massacre     1.77
assault      1.75
Violence     1.65
assaults     1.65
Attacks      1.65
violent      1.61
massacres    1.60
Activations Density: 3.363%