Explanations
punctuation symbols, especially periods and exclamation marks
Negative Logits
Token          Logit
ÐIJÑĢÑħÑĸв     -0.15
urtles         -0.14
«ĺ             -0.14
ürlich         -0.13
vä             -0.13
allet          -0.13
-Compatible    -0.13
gamber         -0.13
oler           -0.13
Geile          -0.13
Positive Logits
Token          Logit
and            0.23
which          0.19
to             0.17
for            0.17
of             0.16
with           0.16
that           0.16
as             0.16
but            0.16
or             0.15
Activations Density: 1.842%