Explanations
definite articles in Spanish
Negative Logits
ji        -0.19
\Id       -0.16
alla      -0.15
kud       -0.15
ystone    -0.15
ystack    -0.15
elas      -0.15
oner      -0.14
nin       -0.14
orra      -0.14
Positive Logits
amac      0.16
chos      0.15
rox       0.15
esar      0.15
ersh      0.14
Ñĩен      0.14
chod      0.14
upal      0.14
inters    0.14
itchens   0.14
Activations Density 0.015%