INDEX
Explanations
people, places, or events containing the letters "me"
first-person pronouns and related personal references
Negative Logits
Token       Logit
kefeller    -0.83
paralle     -0.71
ernels      -0.70
iosity      -0.70
olor        -0.66
rican       -0.64
raviolet    -0.63
hips        -0.63
keyes       -0.62
inyl        -0.62
Positive Logits
Token       Logit
asure       1.25
anwhile     1.12
zzo         1.06
lda         1.03
isters      0.93
adow        0.91
leon        0.89
adows       0.89
ister       0.89
eting       0.88
Activations Density 0.016%
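
For readers unfamiliar with the density figure, here is a minimal sketch of one common reading of it: the fraction of sampled tokens on which the feature fires at all. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only to reproduce the 0.016% shown above; the dashboard's actual computation may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. This is an illustrative definition, not the tool's own.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active tokens out of 12,500.
sample = [0.0] * 12_498 + [0.7, 1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.016% corresponds to the feature activating on roughly 1 token in every 6,250.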