INDEX
Explanations
proper names
references to people's names
Negative Logits
Token     Logit
awar      -0.74
*/(       -0.74
unin      -0.74
hire      -0.72
heimer    -0.72
umbn      -0.71
olesc     -0.70
shit      -0.69
abol      -0.69
fare      -0.68
Positive Logits
Token     Logit
Lynn      1.04
Louise    1.01
Patricia  1.00
Marie     0.98
Nicole    0.95
Jane      0.92
Garcia    0.92
Gloria    0.91
Lopez     0.91
Sue       0.91
Activations Density: 0.065%