INDEX
Explanations
phrases related to breaking rules or laws
instances of the phrase "breaking the law."
Negative Logits
eton       -0.69
geries     -0.65
hereafter  -0.63
Inquis     -0.62
oran       -0.60
Reserved   -0.58
eers       -0.58
reau       -0.57
rily       -0.57
utz        -0.56
Positive Logits
membranes      0.79
membrane       0.74
protocol       0.73
stride         0.70
cycle          0.70
seal           0.70
habit          0.68
advertisement  0.67
boundaries     0.67
brittle        0.67
Activations Density: 0.108%