INDEX
Explanations
words related to sexual misconduct or abuse
references to sexual misconduct or exploitation
Negative Logits
quickShipAvailable    -0.93
GOODMAN               -0.85
eer                   -0.70
arta                  -0.69
Dispatch              -0.68
IELD                  -0.67
*/(                   -0.65
Grants                -0.64
respir                -0.63
oru                   -0.63
Positive Logits
transmitted    0.95
assaulted      0.91
abusing        0.90
assaulting     0.88
oriented       0.83
abused         0.80
ensl           0.79
sexually       0.78
mol            0.77
harassed       0.77
Activations Density: 0.007%