Explanations
proper nouns related to locations or people
proper nouns, particularly names associated with locations and individuals
Negative Logits
retty         -0.92
ichick        -0.83
idable        -0.80
INTON         -0.74
inqu          -0.73
estern        -0.70
okia          -0.68
ablishment    -0.68
cohol         -0.68
GAN           -0.68
Positive Logits
leness        0.84
erer          0.78
elf           0.75
rates         0.74
ures          0.71
matter        0.68
lighting      0.67
erers         0.67
rate          0.66
shire         0.64
Activations Density: 0.062%