INDEX
Explanations
mentions of terrorist organizations and related names
Negative Logits
Token      Logit
.sponge    -0.18
ç¿         -0.15
artz       -0.15
HAL        -0.15
assi       -0.15
ifndef     -0.14
ORB        -0.14
arts       -0.14
autos      -0.14
Äijứng     -0.13
Positive Logits
Token      Logit
-Qaeda     0.26
Qaeda      0.25
aeda       0.21
queda      0.18
Gore       0.17
hur        0.17
leg        0.16
azeera     0.16
-Q         0.16
queda      0.16
Activations Density: 0.008%