Explanations
references to political figures and their actions or attributes
Negative Logits
alette          -0.16
řet             -0.15
addCriterion    -0.14
Signed          -0.14
ipmap           -0.14
affer           -0.14
arie            -0.14
guest           -0.14
anarchists      -0.14
uchar           -0.13
Positive Logits
political       0.18
prote           0.16
career          0.16
polit           0.16
insider         0.15
asc             0.15
elay            0.15
intrig          0.15
任              0.15
background      0.15
Activations Density 0.127%