INDEX
Explanations
words related to males or masculinity
references to male entities or characteristics
Negative Logits
etsk        -0.79
GOODMAN     -0.79
leans       -0.75
krit        -0.72
today       -0.71
adr         -0.70
lov         -0.70
anwhile     -0.69
rep         -0.67
bleacher    -0.66
Positive Logits
volent          1.70
ejac            0.92
genital         0.90
vol             0.87
Male            0.83
male            0.82
sexuality       0.81
infertility     0.80
circumcision    0.76
males           0.76
Activations Density: 0.011%