Explanations
references to criminal activity, particularly burglary and theft
Negative Logits
viol      -0.16
nett      -0.15
itas      -0.14
.INPUT    -0.14
ots       -0.14
OTS       -0.14
Lump      -0.13
QU        -0.13
tie       -0.13
.pag      -0.13
Positive Logits
stored    0.18
eric      0.18
Stored    0.17
ambi      0.17
stealing  0.17
linger    0.16
cht       0.16
stole     0.16
stolen    0.16
unda      0.16
Activations Density 0.151%