INDEX
Explanations
instances of arrests and police interactions
Negative Logits
ANDROID     -0.16
.unpack     -0.15
ÑĢев        -0.14
merc        -0.14
å±Ĭ         -0.14
scor        -0.14
ινÏĮ        -0.14
ãģĹãģ®      -0.13
äl          -0.13
eyJ         -0.13
Positive Logits
arrest      0.66
arrested    0.56
Arrest      0.56
arrests     0.56
arresting   0.50
detain      0.42
appreh      0.42
-ar         0.38
detention   0.37
detained    0.36
Activations Density 0.237%