INDEX
Explanations
references to violence or conflict involving authority figures
Negative Logits
Autorisations
-0.40
Autorizaciones
-0.36
र्भ
-0.35
raulic
-0.31
⎩
-0.30
/#{
-0.29
'__
-0.29
Ladd
-0.29
apt
-0.28
apt
-0.28
Positive Logits
للاسماء
0.68
killing
0.67
assassinated
0.62
indígen
0.61
assassination
0.60
Killing
0.58
surla
0.58
Geſ
0.57
DeleteMapping
0.57
houſe
0.57
Activations Density 0.055%