INDEX
Explanations
expressions of political sentiment or actions related to governmental functions
Negative Logits
ihar               -0.14
rych               -0.14
vere               -0.14
PasswordEncoder    -0.14
دود                -0.14
古                 -0.14
빈                 -0.14
LEC                -0.14
éĸ                 -0.13
ади                -0.13
Positive Logits
plots              0.24
Zion               0.23
conspir            0.21
Zionist            0.21
Tak                0.21
intrig             0.20
colonial           0.20
plot               0.19
US                 0.19
occupation         0.18
Activations Density 0.049%