Explanations
instances of individuals being convicted or charged with a crime
Negative Logits
allax     -0.18
aben      -0.17
eref      -0.17
atoire    -0.16
éŁ¿       -0.16
ohana     -0.15
infeld    -0.15
isay      -0.15
ransom    -0.14
olina     -0.14
Positive Logits
charged    0.45
charged    0.36
charge     0.34
Charg      0.34
charges    0.31
charging   0.28
Charge     0.27
charge     0.27
found      0.26
charges    0.26
Activations Density: 0.093%