INDEX
Explanations
concepts related to domestic violence and mental health
Negative Logits
nez        -0.17
gio        -0.16
dÄĽ        -0.14
vrier      -0.14
ursed      -0.14
utron      -0.14
nable      -0.14
deaux      -0.14
uren       -0.14
.getTag    -0.13
Positive Logits
domestic   0.52
Domestic   0.49
batter     0.42
Batter     0.38
violence   0.37
domest     0.37
Dom        0.35
abuse      0.35
/dom       0.34
battered   0.34
Activations Density: 0.095%