Explanations
words related to famous names or figures
proper nouns and names associated with music and entertainment figures
Negative Logits
Token         Logit
pter          -0.78
ources        -0.77
earch         -0.75
flare         -0.71
emis          -0.70
individual    -0.67
Lauder        -0.66
oppos         -0.66
occup         -0.66
actionDate    -0.65
Positive Logits
Token         Logit
Bourbon       0.73
Cookie        0.73
zees          0.71
glers         0.66
Cookies       0.66
ingo          0.66
uckle         0.64
abba          0.63
itty          0.62
theless       0.62
Activations Density: 0.351%