INDEX
Explanations
names of individuals in different contexts
names of people and characters
Negative Logits
nai       -0.83
ĩ         -0.82
uments    -0.74
kaya      -0.73
ENCY      -0.73
ents      -0.72
acters    -0.69
nance     -0.67
ences     -0.67
omen      -0.67
Positive Logits
Moor      0.86
hyde      0.76
McF       0.75
tera      0.74
hattan    0.73
thur      0.66
erva      0.66
anqu      0.65
Buchanan  0.65
deals     0.64
Activations Density 0.034%
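
To give a sense of scale for the density figure, here is a minimal Python sketch. It assumes that "density" means the share of sampled tokens on which the feature's activation is above zero; the page itself does not define the term. The function names and the toy activation list are hypothetical, for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is > 0.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


def tokens_per_firing(density_percent):
    """Average number of tokens between firings for a density given in percent."""
    return 100.0 / density_percent


def expected_firings(density_percent, n_tokens):
    """Expected number of active tokens in a sample of `n_tokens` tokens."""
    return density_percent / 100.0 * n_tokens


# Toy data (hypothetical): one active token out of four.
toy = [0.0, 0.0, 0.0, 1.5]
toy_density = activation_density(toy)

# Applying the reported figure of 0.034%.
gap = tokens_per_firing(0.034)
hits = expected_firings(0.034, 1_000_000)
```

Under that assumption, a density of 0.034% corresponds to roughly one active token in every 2,900 or so, or about 340 active tokens per million.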