INDEX
Explanations
the word "to," indicating actions or purposes
allows to verb
Negative Logits
the          -0.43
,            -0.43
what         -0.35
mundiales    -0.31
têtes        -0.30
histórica    -0.29
inqui        -0.29
official     -0.29
news         -0.28
refuer       -0.28
Positive Logits
utafitiHapana    0.95
出版年           0.89
<unused41>       0.88
[@BOS@]          0.88
<unused51>       0.88
<unused16>       0.88
<unused43>       0.88
<unused42>       0.88
<unused3>        0.88
<unused14>       0.88
Activations Density: 0.022%