INDEX
Explanations
instances of the word "girl" or "girls" in a text
references to girls and young females
Negative Logits
rehend      -0.87
glas        -0.82
PDATE       -0.79
rehens      -0.77
psey        -0.76
nce         -0.76
utherford   -0.74
fusc        -0.73
raltar      -0.73
umbnail     -0.70
Positive Logits
girls       1.02
panties     1.01
girls       0.99
girl        0.99
dolls       0.94
Scouts      0.93
Girls       0.88
Girls       0.84
girl        0.82
daughters   0.79
Activations Density: 0.048%