Explanations
references to criminal activities and legal terminology related to crime
Negative Logits
Token        Logit
Crime        -0.20
Crime        -0.20
criminals    -0.18
crime        -0.18
criminal     -0.18
crime        -0.18
ettle        -0.18
Criminal     -0.18
crim         -0.17
criminal     -0.17
Positive Logits
Token        Logit
ity          0.35
istics       0.25
izing        0.24
ization      0.24
izes         0.24
ized         0.24
ize          0.22
isation      0.21
justice      0.21
iza          0.20
Activations Density 0.007%
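
The dashboard does not define how the density figure is computed. A minimal sketch, assuming it denotes the share of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so that the result matches the 0.007% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: "density" means the fraction of sampled token positions
    on which the feature fires; the source does not state this.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical data: the feature fires on 7 of 100,000 sampled positions.
sample = [0.0] * 99_993 + [1.5] * 7
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.007% means the feature is active on roughly 7 out of every 100,000 tokens, which is consistent with a sparse feature tied to a narrow topic.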