INDEX
Explanations
sensitive topics related to sexual assault, abuse, and misconduct
Negative Logits
ocument      -0.85
Flavoring    -0.84
BLIC         -0.78
arily        -0.77
IVERS        -0.75
REC          -0.75
Solitaire    -0.74
GOODMAN      -0.68
Dispatch     -0.68
overed       -0.67
Positive Logits
volent       1.16
ejac         1.06
genital      1.00
condom       0.99
sex          0.99
genitals     0.98
ager         0.97
condoms      0.93
underwear    0.91
rape         0.89
Activations Density: 4.128%
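
The density figure above is not defined on this page. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature fires (activation above zero): the function name `activation_density`, the `threshold` parameter, and the synthetic sample are all hypothetical and are not taken from the dashboard.

```python
import numpy as np


def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active. The dashboard does not state its exact definition.
    """
    acts = np.asarray(activations, dtype=float)
    if acts.size == 0:
        raise ValueError("activations must contain at least one value")
    return float(np.count_nonzero(acts > threshold)) / acts.size


# Synthetic illustration only: 41 active tokens out of 1000.
sample = np.zeros(1000)
sample[:41] = 0.5
print(f"{activation_density(sample):.3%}")
```

Under that assumption, a density of 4.128% would mean the feature is active on roughly 41 of every 1,000 sampled tokens.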