Explanations
references to children, educational qualifications, and gender-related terms
preceding possessive "'s"
children's, bachelor's, degrees
Negative Logits
Token      Logit
"          -0.68
(          -0.60
(blank)    -0.57
'          -0.55
“          -0.55
.          -0.54
and        -0.54
,          -0.54
A          -0.53
in         -0.52
Positive Logits
Token      Logit
myſelf     0.90
’          0.86
Monfieur   0.86
himſelf    0.86
Theſe      0.81
՚          0.81
―――――      0.80
(blank)    0.76
ſelf       0.75
itſelf     0.75
Activations Density 0.145%
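
As a rough illustration of how a density figure like the one above could be computed, here is a minimal sketch. It assumes that activation density means the fraction of token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, of which 3 have a nonzero activation.
acts = [0.0] * 2000
acts[10], acts[500], acts[1999] = 1.2, 0.4, 3.1

density = activation_density(acts)
print(f"Activation density: {density:.3%}")
```

Under this assumed definition, a density of 0.145% would mean the feature fires on roughly 1 in every 690 tokens.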