INDEX
Explanations
references to political parties, particularly those related to regional and national affiliations
Negative Logits
aghan    -0.16
imar     -0.16
ais      -0.15
aeda     -0.15
oble     -0.15
unt      -0.14
agan     -0.14
inus     -0.14
reed     -0.14
æij      -0.14
POSITIVE LOGITS
ucha         0.17
osa          0.16
thon         0.15
atk          0.15
ati          0.15
uche         0.15
UCH          0.15
mv           0.15
LIABILITY    0.14
oproject     0.14
Activations Density: 0.013%