INDEX
Explanations
references to reform, particularly in economic and policy contexts
Negative Logits
ially      -0.18
kest       -0.18
emas       -0.16
hood       -0.16
oub        -0.15
IAL        -0.15
fines      -0.14
formal     -0.14
formally   -0.14
chy        -0.14
Positive Logits
atted      0.28
ative      0.27
ulated     0.23
ulate      0.21
ulation    0.20
atories    0.20
ers        0.19
ulating    0.19
idable     0.19
/update    0.17
Activations Density 0.014%