Explanations
phrases related to political figures and office positions
Negative Logits
Token     Logit
arrow     -0.71
complex   -0.67
iw        -0.66
ighed     -0.66
iven      -0.65
polit     -0.63
orry      -0.63
acid      -0.63
rather    -0.62
gru       -0.62
Positive Logits
Token         Logit
victory       0.98
nomination    0.93
citizenship   0.92
immortality   0.91
presidency    0.90
supremacy     0.90
prizes        0.89
fame          0.87
entry         0.87
redemption    0.87
Activations Density: 1.405%