Explanations
terms related to domestic violence and abusive relationships
Negative Logits
Token            Logit
AssemblyCulture  -0.86
ethene           -0.75
riuscito         -0.72
τά               -0.70
ajuato           -0.70
arschijnlijk     -0.70
Stras            -0.69
ima              -0.69
imread           -0.69
grap             -0.69
Positive Logits
Token         Logit
Domestic      1.04
domestic      1.01
Domestic      0.97
domestic      0.96
domesti       0.86
ESTIC         0.86
domestically  0.81
Household     0.74
household     0.73
……………………      0.72
Activations Density: 0.165%