Explanations
female names
mentions of prominent individuals, particularly focusing on the names Barbara and Paula
Negative Logits
hire        -0.85
*/(         -0.78
atory       -0.77
culosis     -0.77
hattan      -0.77
atories     -0.76
byter       -0.72
olesc       -0.71
heimer      -0.70
sight       -0.68
Positive Logits
Lynn        1.08
Marie       1.01
Anne        0.95
Louise      0.94
Mae         0.94
Michelle    0.92
Swanson     0.92
Patricia    0.91
Anne        0.91
Christina   0.91
Activations Density: 0.051%