Explanations
end of sentence punctuation
Negative Logits
Token    Logit
but      -1.64
a        -1.31
hline    -1.08
and      -1.08
some     -1.06
in       -1.05
one      -1.03
尙       -1.03
𝐳        -1.00
an       -1.00
Positive Logits
Token        Logit
the          2.31
từ           1.49
from         1.35
první        1.34
firstly      1.33
настолько    1.31
tellement    1.30
such         1.27
tão          1.26
such         1.23
Activations Density: 0.011%