INDEX
Explanations
instances of punctuation and periods indicating sentence endings
Negative Logits
addock   -0.17
onica    -0.17
İZ       -0.16
ivas     -0.14
kelig    -0.14
.ci      -0.14
podium   -0.14
Dawson   -0.14
rike     -0.13
odule    -0.13
Positive Logits
759          0.16
loth         0.15
ÙĪÙ¾         0.14
oad          0.14
edral        0.14
̧            0.14
ania         0.14
ells         0.13
ãĥĥãĤ·ãĥ¥    0.13
isque        0.13
Activations Density 0.012%