INDEX
Explanations
words related to violent actions or events
occurrences of the word "one"
Negative Logits
uits          -0.84
ships         -0.79
hips          -0.78
ooks          -0.77
inders        -0.73
lems          -0.72
osponsors     -0.71
emies         -0.68
ographics     -0.67
zos           -0.67
Positive Logits
hundred       1.01
unnamed       1.01
person        0.93
woman         0.90
unidentified  0.87
thousand      0.83
guy           0.82
participant   0.81
elderly       0.80
protester     0.79
Activations Density: 0.097%