Explanations
political figures, particularly presidents and prime ministers
references to political leaders and their actions or statements
Negative Logits
Token       Logit
pear        -0.70
interpol    -0.68
Californ    -0.66
grad        -0.66
Bronx       -0.66
phantom     -0.65
dots        -0.65
feral       -0.64
Ghosts      -0.64
soph        -0.64
Positive Logits
Token       Logit
Jinping     0.95
imar        0.90
oğan        0.82
vernment    0.80
llor        0.80
uty         0.80
appoint     0.76
luck        0.75
vowed       0.74
appointed   0.73
Activations Density 0.359%