INDEX
Explanations
mentions of political regimes and governments in conflict situations
Negative Logits
ember     -0.16
nze       -0.15
Href      -0.15
ladu      -0.14
teg       -0.14
رخ        -0.14
ovit      -0.14
acular    -0.14
itag      -0.13
æĴ        -0.13
Positive Logits
429         0.15
achs        0.14
大人        0.14
Spare       0.14
.listFiles  0.14
çıkart      0.14
isky        0.14
Seks        0.14
言って      0.13
viar        0.13
Activations Density 0.028%