INDEX
Explanations
legal and medical discussions of gender in discrimination cases
Negative Logits
<bos>       -1.27
purpoſe     -1.22
avoient     -1.16
myſelf      -1.16
itſelf      -1.16
himſelf     -1.12
Efq         -1.12
ſtate       -1.11
Monfieur    -1.10
whoſe       -1.08
Positive Logits
-           0.71
↵↵          0.69
rmtree      0.67
<td>        0.66
via         0.63
,           0.59
“           0.55
-           0.55
n           0.54
(           0.54
Activations Density 1.236%