Explanations
references to political parties and their influence on governance
Negative Logits
nect          -0.19
s             -0.16
emos          -0.15
ister         -0.15
ri            -0.15
bane          -0.15
ij            -0.14
lider         -0.14
emer          -0.14
het           -0.14
Positive Logits
ing           0.18
opleft        0.17
ament         0.16
unately       0.15
allax         0.15
-sponsored    0.14
Qué           0.14
icular        0.14
ynom          0.14
hood          0.14
Activations Density: 0.031%