INDEX
Explanations
references to legal issues and penalties related to crimes and regulations
Negative Logits
ysa        -0.16
lum        -0.14
ren        -0.14
sold       -0.13
ilo        -0.13
cert       -0.13
azor       -0.13
esinin     -0.13
ela        -0.13
iet        -0.13
Positive Logits
penalty        0.46
penalties      0.43
Penalty        0.34
penal          0.33
fines          0.30
punishable     0.29
punishment     0.28
_penalty       0.28
pen            0.27
punishments    0.26
Activations Density 0.287%
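
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    non-zero activation. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.287% would mean the feature is active on roughly 3 of every 1,000 sampled tokens.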