Explanations
phrases related to domestic violence
references to domestic violence and related issues
Negative Logits
UMP       -0.83
aer       -0.76
エル      -0.71
erella    -0.69
otto      -0.69
lists     -0.69
*/(       -0.67
peror     -0.67
atus      -0.66
Recipe    -0.65
Positive Logits
abuse         1.08
Abuse         1.00
violence      0.99
abuse         0.98
Violence      0.96
abusers       0.87
homicides     0.86
harassment    0.84
violence      0.79
abusive       0.78
Activations Density 0.073%