Explanations
references to various forms of violence, including rape and cyber violence
references to sexual assault and related statistics
Negative Logits
Dise       -0.65
sorely     -0.64
Profit     -0.62
Sov        -0.60
rules      -0.60
archs      -0.60
legions    -0.59
Sov        -0.59
pots       -0.58
anarchy    -0.57
Positive Logits
≥               0.95
spouse          0.92
acquaintance    0.87
respondent      0.83
or              0.83
partner         0.82
themselves      0.81
anten           0.81
oneself         0.76
compared        0.76
Activations Density 0.401%