INDEX
Explanations
political terms and proper nouns related to influential figures
words related to political themes and controversies
Negative Logits
icularly    -0.91
umin        -0.90
ework       -0.88
ettes       -0.87
clud        -0.85
estyles     -0.85
oso         -0.85
eteenth     -0.84
aceous      -0.82
iard        -0.80
Positive Logits
Passage     0.86
ank         0.76
OWN         0.71
uyomi       0.70
APE         0.68
Seas        0.66
rouse       0.64
龍契士      0.63
Goldstein   0.62
Downs       0.61
Activations Density: 0.025%