INDEX
Explanations
Proper nouns related to politicians or governmental figures
References to specific individuals involved in political discourse
Negative Logits
eers        -0.84
ghai        -0.80
ndum        -0.77
ighth       -0.76
apore       -0.74
ographed    -0.73
opsis       -0.72
ocol        -0.71
chnology    -0.71
agra        -0.71
Positive Logits
Gibbs       1.02
sonian      0.83
hops        0.81
sein        0.72
layer       0.70
LIN         0.70
sim         0.68
step        0.68
ford        0.65
lins        0.65
Activations Density: 0.009%