Explanations
references to political parties and their representatives
Negative Logits
aight       -0.17
kå          -0.16
STITUTE     -0.15
ares        -0.14
å»¶         -0.14
acons       -0.14
-plugins    -0.14
ãĥŃãĥ¼      -0.14
avan        -0.14
munition    -0.14
Positive Logits
party       0.29
parties     0.24
Party       0.23
Parties     0.21
Party       0.21
PARTY       0.21
party       0.20
spoiler     0.19
nist        0.19
.party      0.19
Activations Density: 0.078%
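A density figure like this is usually read as the share of sampled tokens on which the feature has a nonzero activation. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature fires; the dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, 78 of which activate the feature.
sample = [0.0] * 99_922 + [1.3] * 78
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.078% would mean the feature fires on roughly 8 of every 10,000 tokens, which fits a feature keyed to one narrow topic.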