INDEX
Explanations
references to criminal activity and associated legal terms
Negative Logits
adel        -0.16
ходить      -0.16
decorators  -0.15
illas       -0.14
iej         -0.14
otts        -0.14
itzer       -0.14
yst         -0.13
Discrim     -0.13
suicidal    -0.13
Positive Logits
crime       0.45
crimes      0.39
offense     0.36
crime       0.34
offenses    0.34
Crime       0.31
infra       0.31
offence     0.30
Off         0.29
-cr         0.28
Activations Density 0.112%