Explanations
words related to laws, crimes, and punishments
terms related to legal penalties and offenses
Negative Logits
romy       -0.72
iance      -0.72
otin       -0.71
ynthesis   -0.70
hole       -0.69
umbn       -0.68
hall       -0.68
eds        -0.67
ouf        -0.67
por        -0.66
Positive Logits
punishable       1.40
punished         0.95
punish           0.92
unfocusedRange   0.91
punishment       0.86
notor            0.84
Penalty          0.83
lashes           0.79
penalties        0.77
punishments      0.77
Activations Density 0.014%