INDEX
Explanations
references to political figures and their statements
Negative Logits
Token     Logit
_QMARK    -0.16
EPROM     -0.15
arkan     -0.15
edis      -0.15
nees      -0.14
ocab      -0.14
gebn      -0.14
passes    -0.14
mé        -0.14
DMIN      -0.14
Positive Logits
Token     Logit
ano       0.17
Pref      0.16
al        0.16
vo        0.15
ÙİÙĬ      0.15
avia      0.15
view      0.14
on        0.14
avn       0.14
Pref      0.14
Activations Density: 0.219%