Explanations
references to legal or disciplinary actions
mentions of punishment in various contexts
Negative Logits
Token             Logit
eds               -0.82
NetMessage        -0.76
soDeliveryDate    -0.75
sonian            -0.72
rote              -0.72
ergy              -0.71
roots             -0.69
zyme              -0.69
gow               -0.67
thus              -0.67
Positive Logits
Token             Logit
punishment        1.18
punishments       1.05
punished          1.03
punish            1.02
lashes            0.90
inflicted         0.87
sanction          0.85
harshly           0.84
penalty           0.84
torment           0.83
Activations Density 0.015%