Explanations
Names of individuals, specifically the name "Emily"
Negative Logits
ername        -0.56
xual          -0.53
lasses        -0.51
tenance       -0.49
stood         -0.49
kef           -0.49
emonium       -0.48
lay           -0.48
specificity   -0.48
nesses        -0.47

Positive Logits
Dickinson     0.81
Lak           0.76
gdala         0.58
otte          0.57
pton          0.55
gown          0.55
sburg         0.55
endi          0.53
ivered        0.50
issance       0.50
Activations Density: 8.755%