INDEX
Explanations
phrases related to domestic violence and work-related health issues
Negative Logits
ĸļ              -0.92
ometry          -0.82
itone           -0.79
Tokens          -0.78
iture           -0.77
ulously         -0.76
utical          -0.76
ometer          -0.76
ruary           -0.76
largeDownload   -0.74
Positive Logits
disease         1.46
violence        1.37
corruption      1.34
terrorism       1.34
diseases        1.31
infection       1.31
crime           1.30
piracy          1.29
vandalism       1.29
theft           1.28
Activations Density: 4.125%