INDEX
Explanations
names of individuals, particularly prominent women
Negative Logits
uko           -0.15
mux           -0.15
ucer          -0.14
çĵľ           -0.14
arde          -0.14
RuleContext   -0.14
evi           -0.14
stad          -0.14
tuk           -0.14
stin          -0.14
Positive Logits
Ann           0.26
ann           0.24
Ann           0.23
ann           0.21
-An           0.20
Sue           0.19
Lynn          0.19
Anne          0.18
ANN           0.17
ANN           0.17
Activations Density 0.041%