INDEX
Explanations
verbs related to implementing or introducing a change or action
actions related to implementing changes or introducing new elements
Negative Logits
Token      Logit
pring      -0.77
eem        -0.75
ems        -0.75
ogi        -0.72
esome      -0.72
Present    -0.72
present    -0.72
edly       -0.71
fortune    -0.70
ulet       -0.70
Positive Logits
Token         Logit
stricter      0.95
fewer         0.89
rid           0.89
additional    0.87
tighter       0.86
existing      0.83
tougher       0.83
moratorium    0.83
foreigners    0.79
restrictions  0.78
Activations Density 0.299%