Explanations
mentions of specific political figures
Negative Logits
ast            -0.06
oÅĻ            -0.06
aster          -0.06
q              -0.06
Ñĸб            -0.06
stead          -0.06
old            -0.05
\Migrations    -0.05
unga           -0.05
469            -0.05
Positive Logits
^K             0.07
ednou          0.07
arges          0.07
.are           0.07
rones          0.07
inned          0.07
aldi           0.07
é¡Ķ            0.07
DCF            0.07
achel          0.07
Activations Density: 0.000%