Explanations
words related to death and violence
Negative Logits
saraba            -0.57
httphttps         -0.56
despe             -0.54
iformis           -0.52
++++++++++++++++  -0.51
lenker            -0.51
zionato           -0.51
ModelExpression   -0.51
looped            -0.50
mobileqq          -0.50
Positive Logits
killed     0.62
killed     0.58
Killed     0.56
Killed     0.48
injured    0.46
attacked   0.44
Савезне    0.43
Ahnung     0.42
tué        0.41
promoción  0.41
Activations Density 0.226%