INDEX
Explanations
statements expressing skepticism or criticism of political systems and their integrity
Negative Logits
fur           -0.16
fur           -0.15
Fur           -0.14
ussen         -0.13
wyn           -0.13
.Setter       -0.13
anager        -0.13
'[            -0.13
.synthetic    -0.13
Ost           -0.13
Positive Logits
yor           0.19
edis          0.16
öh            0.15
ARS           0.15
eczy          0.14
azo           0.14
sé            0.14
RS            0.14
élé           0.14
Comprehensive 0.14
Activations Density 0.000%