Explanations
references to females, specifically using the word "girl"
references to "girl" in various contexts
Negative Logits
Token                             Logit
insula                            -0.91
aeda                              -0.89
contiguous                        -0.78
raltar                            -0.77
psey                              -0.74
aution                            -0.74
PDATE                             -0.73
erenn                             -0.73
etsk                              -0.72
rawdownloadcloneembedreportprint  -0.71
Positive Logits
Token   Logit
Scouts  1.02
hood    0.98
ishly   0.96
girl    0.86
ish     0.86
girls   0.84
girls   0.81
girl    0.80
glers   0.79
bag     0.79
Activations Density: 0.017%