Explanations
mentions of law enforcement incidents
Negative Logits
resil       -0.73
cens        -0.68
arson       -0.68
Levant      -0.68
bombard     -0.67
dish        -0.67
corrections -0.67
behavi      -0.67
redevelop   -0.64
shorth      -0.64
Positive Logits
fw  1.15
df  1.12
bm  1.09
zx  1.07
zn  1.06
bc  1.04
fb  1.04
cd  1.03
wb  1.01
gd  0.97
Activations Density: 0.050%