Explanations
derogatory sexualized female terms
Negative Logits
盆          0.40
ассорти     0.37
Laz         0.36
Saf         0.36
ಗಾರ         0.36
roben       0.36
sensory     0.35
ঢাকার       0.35
絢          0.35
Bliss       0.35
Positive Logits
bitch        2.09
bitches      1.77
whore        1.55
slut         1.47
slut         1.20
promiscu     1.18
prostitute   1.16
婊           1.16
prostit      1.09
prostitutes  1.01
Activations Density 0.034%