INDEX
Explanations
names of famous people, especially celebrities
proper nouns, particularly the names of notable individuals
Negative Logits
llah      -0.75
angan     -0.72
iate      -0.71
ial       -0.70
wi        -0.70
ood       -0.70
arter     -0.69
raq       -0.68
irm       -0.68
iences    -0.68
Positive Logits
Manson    1.33
Monroe    0.89
Elvis     0.87
Marilyn   0.77
azine     0.75
Osw       0.71
Spice     0.70
Pres      0.70
swick     0.70
osaurs    0.68
Activations Density: 0.013%