INDEX
Explanations
instances of punctuation marks, particularly dashes and ellipses, in the text
Negative Logits
acman       -0.16
åķ          -0.16
ause        -0.15
IGHL        -0.15
ife         -0.15
SOLE        -0.14
eed         -0.14
.dense      -0.14
aspers      -0.14
oul         -0.14
Positive Logits
oret        0.17
-uppercase  0.15
sdale       0.14
-*-č↵       0.13
hence       0.13
equals      0.13
fx          0.13
ayah        0.13
wiki        0.13
both        0.13
Activations Density: 0.121%