Explanations
terms related to sexual abuse, allegations, and assault
Negative Logits
travel    -0.87
peror     -0.80
izen      -0.73
views     -0.67
ocular    -0.67
iasm      -0.66
rian      -0.65
AY        -0.65
uber      -0.65
vision    -0.65
Positive Logits
perpetrated    1.15
victims        1.10
Victim         0.98
victim         0.95
Victims        0.94
Rape           0.92
survivors      0.90
raped          0.90
inflicted      0.90
rape           0.87
Activations Density: 0.971%