INDEX
Explanations
references to political figures and locations
references to political events and manifesto launches
Negative Logits
aughs     -0.77
geist     -0.74
neys      -0.73
RM        -0.72
endum     -0.69
imer      -0.68
Digest    -0.66
urally    -0.66
ndum      -0.65
$.        -0.65
Positive Logits
æ©        0.86
å·        0.79
èĪ        0.79
æī        0.72
ãģ®å      0.72
)=(       0.71
äºĶ       0.70
ãģ®ç      0.70
luaj      0.70
soType    0.69
Activations Density: 0.117%