INDEX
Explanations
references to police-related activities or incidents
Negative Logits
aight       -0.18
adr         -0.17
nackte      -0.16
Margins     -0.16
army        -0.15
ном         -0.15
pery        -0.14
bane        -0.14
webdriver   -0.14
/key        -0.14
Positive Logits
duty         0.24
badge        0.23
rookies      0.22
colleague    0.21
colleagues   0.21
uniform      0.21
duties       0.20
Duty         0.20
patrol       0.20
rookie       0.19
Activations Density 0.203%