Explanations
verbs or phrases related to making changes or adjustments
elements associated with changes in structure or rules
Negative Logits
ymph            -0.64
GV              -0.63
inar            -0.61
alone           -0.60
interstitial    -0.60
laughs          -0.59
âĹ¼             -0.59
heny            -0.58
Bei             -0.57
contained       -0.57
Positive Logits
accordingly     1.24
drastically     1.22
dramatically    1.09
radically       1.08
unilaterally    0.88
substantially   0.86
forever         0.82
overnight       0.81
subtly          0.80
abruptly        0.80
Activations Density 0.241%