INDEX
Explanations
references to political events or actions
Negative Logits
干          -0.16
esda        -0.16
cliffe      -0.15
tain        -0.14
scaleY      -0.14
هر          -0.14
inidad      -0.14
tank        -0.14
_EXTERN     -0.14
пох         -0.14
Positive Logits
cent        0.15
ved         0.14
692         0.14
practical   0.14
Rash        0.14
änd         0.14
ILT         0.14
aram        0.14
oogle       0.14
Bak         0.14
Activations Density 0.373%