INDEX
Explanations
proper nouns related to individuals or organizations
proper nouns, particularly names that appear frequently in the text
Negative Logits
orie      -0.92
eric      -0.85
oral      -0.84
othe      -0.73
aned      -0.73
ocrats    -0.72
atics     -0.71
inx       -0.70
othes     -0.70
ane       -0.69
Positive Logits
gerald    1.01
lein      0.85
Braun     0.81
heimer    0.78
Bunny     0.74
patrick   0.73
wald      0.73
ufact     0.73
hler      0.71
minster   0.71
Activations Density 0.024%