INDEX
Explanations
phrases indicating the implementation of rules or laws
instances of the phrase "go into effect."
Negative Logits
Corpus       -0.67
Primordial   -0.67
asca         -0.67
Side         -0.64
ides         -0.64
Stain        -0.64
pi           -0.62
Crus         -0.60
stained      -0.60
IDES         -0.60
Positive Logits
uate         0.92
uation       0.89
uated        0.86
angering     0.85
uating       0.83
ional        0.80
||||         0.77
ures         0.74
curfew       0.73
ually        0.72
Activations Density 0.049%