INDEX
Explanations
instances of formatting symbols and punctuation typically used in text
Negative Logits
Token           Logit
^(@)            -1.56
تضيفلها         -1.47
yntaxException  -1.44
myſelf          -1.44
NUMX            -1.43
Савезне         -1.42
Мексичка        -1.40
itſelf          -1.39
Efq             -1.35
ſelf            -1.33
Positive Logits
Token           Logit
<eos>           1.20
↵↵              1.17
↵               1.01
.               0.84
"               0.83
[missing]       0.83
:               0.82
(               0.77
1               0.75
5               0.74
Activations Density 0.815%