INDEX
Explanations
punctuation or sentence-ending structures
Negative Logits
Token      Logit
aniem      -0.17
urance     -0.17
igkeit     -0.15
(«         -0.15
’s         -0.15
(“         -0.15
:          -0.14
=-=-=-=-   -0.14
,          -0.14
↵          -0.14
Positive Logits
Token      Logit
cord       0.15
fter       0.14
uci        0.13
Laughs     0.13
ýv         0.13
tbl        0.13
quoting    0.13
infeld     0.12
ichten     0.12
',(        0.12
Activations Density 0.031%