Explanations
incidents involving theft or destruction
Negative Logits
.nano        -0.17
erno         -0.16
Independ     -0.15
öl           -0.14
таб          -0.14
лач          -0.14
deadliest    -0.14
wid          -0.14
Haz          -0.14
assassin     -0.14
Positive Logits
theft        0.37
stolen       0.36
thieves      0.34
thief        0.32
Theft        0.31
steal        0.29
stealing     0.29
burgl        0.28
burglary     0.28
burg         0.27
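The token and score pairs listed above can be read as the two ends of one ranked list. The sketch below shows one way such a split might be produced from a token-to-logit mapping; the function name `top_logits` and the dictionary layout are hypothetical, and only a few of the listed values are reused for illustration.

```python
def top_logits(logit_by_token, k=3):
    """Return the k most negative and k most positive (token, logit) pairs.

    `logit_by_token` is a hypothetical dict mapping token strings to the
    feature's logit contribution; it is not the dashboard's actual format.
    """
    ranked = sorted(logit_by_token.items(), key=lambda kv: kv[1])
    negative = ranked[:k]            # lowest scores first
    positive = ranked[::-1][:k]      # highest scores first
    return negative, positive


# A few values copied from the lists above.
logits = {
    "theft": 0.37, "stolen": 0.36, "thieves": 0.34,
    ".nano": -0.17, "erno": -0.16, "Independ": -0.15,
}
negative, positive = top_logits(logits, k=2)
print(negative)
print(positive)
```

Sorting once and slicing both ends keeps the two lists consistent with each other, which matters when several tokens share the same rounded score.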
Activations Density 0.087%
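As a rough illustration of what a density figure like this could mean, the sketch below assumes it is the percentage of sampled tokens on which the feature has a non-zero activation. That definition, the function name `activation_density`, and the sample data are assumptions made for the example, not something the page states.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of tokens whose activation exceeds `threshold`.

    Assumes density means "share of tokens where the feature fires";
    the dashboard's exact definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 87 of 100,000 tokens.
sample = [1.0] * 87 + [0.0] * 99_913
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.087% would correspond to the feature firing on roughly 87 out of every 100,000 tokens.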