INDEX
Explanations
proper names related to legal matters
mentions of female names
Negative Logits
nir       -1.13
lasses    -0.99
rance     -0.86
ugu       -0.83
eling     -0.81
doms      -0.76
indal     -0.75
eled      -0.75
dfx       -0.74
inary     -0.74
Positive Logits
Nicole    1.15
Knox      0.96
herself   0.93
Marie     0.91
Marie     0.86
Mae       0.85
Louise    0.83
Judd      0.82
Amanda    0.82
Madison   0.81
Activations Density: 0.043%