INDEX
Explanations
proper nouns related to geopolitical events or political figures
Negative Logits
Token       Logit
cember      -1.10
inated      -0.94
istries     -0.92
isphere     -0.91
aukee       -0.91
eering      -0.90
bush        -0.90
apult       -0.89
boarding    -0.88
acted       -0.86
Positive Logits
Token          Logit
anyone         0.98
anybody        0.94
actic          0.92
corrections    0.85
dozens         0.82
Greenberg      0.82
many           0.81
Scarlet        0.80
usual          0.80
iability       0.80
Activations Density: 0.285%