Explanations
phrases related to laws, regulations, and enforcement
terms related to imposing rules or restrictions
Negative Logits
Token        Logit
sembly       -0.66
adelphia     -0.65
uble         -0.64
pring        -0.64
addin        -0.62
ournals      -0.61
uana         -0.60
connected    -0.59
Room         -0.57
ource        -0.57
Positive Logits
Token          Logit
restrictions   1.04
curfew         0.97
punishments    0.96
onto           0.94
brakes         0.93
burdens        0.93
draconian      0.91
harsher        0.89
quotas         0.88
penalties      0.88
Activations Density: 0.190%