Explanations
references to girls and gender-related themes
girl and boy
Negative Logits
retirees     -0.40
utilizzando  -0.38
using        -0.35
utilizando   -0.34
retired      -0.34
unak         -0.33
usando       -0.33
utilising    -0.33
kinerja      -0.33
Begegn       -0.32
Positive Logits
girl   1.29
girls  1.26
Girls  1.23
Girls  1.22
boys   1.22
boy    1.20
GIRLS  1.18
Girl   1.16
girls  1.16
girl   1.14
Activations Density 0.028%