INDEX
Explanations
proper nouns, specifically names of political figures
Negative Logits
Token        Logit
omaly        -0.82
IENT         -0.78
Parenthood   -0.77
ajor         -0.76
ibles        -0.76
Native       -0.73
izations     -0.70
oard         -0.70
orage        -0.69
onyms        -0.69
Positive Logits
Token            Logit
Jinping          1.19
Hollande         1.17
Macron           0.87
appoint          0.85
congratulated    0.82
confid           0.80
administration   0.80
reelection       0.79
congrat          0.79
aide             0.77
Activations Density 0.009%