INDEX
Explanations
references to women and commands in programming contexts
Negative Logits
sc    -0.33
sl    -0.32
sa    -0.31
se    -0.31
st    -0.29
sk    -0.29
sw    -0.28
sp    -0.28
sm    -0.28
sh    -0.28
Positive Logits
S     0.24
¡     0.23
SB    0.22
SX    0.21
SZ    0.21
SJ    0.21
½     0.21
SV    0.20
SG    0.20
SID   0.20
Activations Density 0.154%