Explanations
statements regarding political allegations and claims
Negative Logits
#           -0.18
ua          -0.15
PREF        -0.15
kaç         -0.15
shima       -0.15
equality    -0.15
-reply      -0.14
ลาà¸Ķ      -0.14
leton       -0.13
aru         -0.13
Positive Logits
Marine      0.16
eyh         0.16
presidency  0.14
ipple       0.14
Donald      0.14
izio        0.14
himself     0.14
Pres        0.14
Rug         0.14
ardash      0.14
Activations Density: 0.198%