INDEX
Explanations
phrases related to criminal activity or violent incidents
Negative Logits
Token     Logit
785       -0.17
hum       -0.16
bjerg     -0.16
uzzi      -0.16
rophy     -0.15
/Dk       -0.15
Shared    -0.14
åĢĻ       -0.14
spb       -0.14
istine    -0.14
Positive Logits
Token          Logit
resist         0.15
auer           0.15
à¤īसन          0.14
stubborn       0.14
IsActive       0.14
elts           0.13
Beats          0.13
(blank)        0.13
resistant      0.13
cooperation    0.13
Activations Density 0.021%