INDEX
Explanations
punctuation marks in the text, particularly periods and question marks
Negative Logits
endl            -0.14
#ad             -0.14
ãģį             -0.14
akedown         -0.14
ãģĵãģ¨ãģ«       -0.13
ziel            -0.13
isch            -0.13
oucher          -0.13
parc            -0.12
ings            -0.12
Positive Logits
but             0.16
and             0.15
er              0.15
esch            0.15
a               0.14
otts            0.14
that            0.14
amina           0.14
tik             0.14
ps              0.13
Activations Density: 0.385%