INDEX
Explanations
phrases indicating arrests or criminal activities
phrases indicating reasons for arrests
Negative Logits
Token             Logit
ires              -0.76
itle              -0.72
atar              -0.72
8000              -0.70
Options           -0.69
eous              -0.69
erd               -0.68
omics             -0.67
soDeliveryDate    -0.67
Born              -0.66
Positive Logits
Token             Logit
violating         1.39
breaching         1.23
gery              1.18
refusing          1.16
assaulting        1.12
allegedly         1.12
interfering       1.12
inciting          1.11
failing           1.06
tresp             1.06
Activations Density: 0.068%