INDEX
Explanations
incidents related to violence and fatalities
Negative Logits
інь                -0.17
remen              -0.17
ertiary            -0.15
ServletResponse    -0.14
aca                -0.14
Loot               -0.14
ookie              -0.13
ауд                -0.13
orden              -0.13
anske              -0.13
Positive Logits
Narr               0.17
Highlander         0.15
Intelligence       0.15
BILE               0.15
Churchill          0.15
antro              0.14
Civ                0.14
symp               0.14
alleged            0.14
allegedly          0.14
Activations Density 0.034%