INDEX
Explanations
references to extremist groups
NEGATIVE LOGITS
apolis    -0.16
رÙĪØª    -0.14
sville    -0.14
alars     -0.13
inox      -0.13
occo      -0.13
ADX       -0.13
ÙĬÙĦا    -0.13
ecko      -0.13
(P        -0.13
POSITIVE LOGITS
-Qaeda    0.26
-'        0.23
-Q        0.22
aa        0.22
Qaeda     0.22
-Z        0.21
-Jul      0.21
-N        0.21
-J        0.21
-Sh       0.21
Activations Density: 0.017%