INDEX
Negative Logits
is       1.69
un       1.59
es       1.57
’        1.44
ar       1.36
-        1.36
o        1.35
en       1.34
of       1.34
an       1.30

Positive Logits
to       1.25
at       1.19
whatnot  1.02
0        1.02
8        1.02
9        0.99
AY       0.90
۰۰       0.89
ам       0.87
\}$,     0.86
Activations Density 0.001%