INDEX
Explanations
phrases related to violations or infringements of rules, laws, or rights
terms related to the concept of violating laws or rights
Negative Logits
arger     -0.80
azor      -0.72
rike      -0.71
anka      -0.71
mad       -0.70
Tycoon    -0.69
arij      -0.69
aws       -0.69
thora     -0.67
affer     -0.67
Positive Logits
unfocusedRange    0.89
violations        0.88
violation         0.80
viol              0.76
violating         0.68
Compliance        0.67
compliance        0.67
inhibition        0.66
Viol              0.64
mson              0.64
Activations Density: 0.038%