Explanations
instances of the word "to" and related infinitive forms
Negative Logits
Token     Logit
isco      -0.15
orgot     -0.15
ся        -0.15
uzzi      -0.14
FORMAT    -0.14
baş       -0.14
aya       -0.14
urre      -0.14
Ih        -0.13
rored     -0.13
Positive Logits
Token     Logit
behold    0.17
echa      0.15
boot      0.15
ryan      0.15
oting     0.14
canh      0.14
EC        0.14
ναν       0.14
/from     0.14
Lager     0.14
Activations Density 0.062%