Explanations
instances of the word "rape" or related terms in the context of accusations and discussions surrounding sexual violence
rape and sexual violence
Negative Logits
Buena      -0.54
Motif      -0.54
Festi      -0.50
Fir        -0.49
Helios     -0.49
Lottie     -0.49
Outsider   -0.49
arity      -0.48
MCU        -0.48
Bü         -0.48
Positive Logits
rape        2.16
Rape        1.83
Rape        1.77
raped       1.76
rape        1.68
raping      1.67
rapist      1.35
violación   1.09
rap         0.79
sexual      0.75
Activations Density: 0.003%