INDEX
Explanations
names of political figures
proper nouns related to political figures and their actions
Negative Logits
Kear          -0.75
Native        -0.74
Cary          -0.73
Goat          -0.68
Meat          -0.65
Avery         -0.65
Parenthood    -0.65
Midwest       -0.65
Liver         -0.63
Load          -0.63
Positive Logits
Jinping       1.11
enei          1.01
supporters    0.91
confid        0.78
orsi          0.78
isma          0.76
campaigned    0.76
impeachment   0.75
ovich         0.74
appoint       0.74
Activations Density: 0.062%