INDEX
Explanations
punctuation marks, particularly periods
programming language file extensions
Negative Logits
<unused41>    -0.95
ſſung         -0.95
<unused1>     -0.94
<unused3>     -0.94
<unused51>    -0.94
<unused52>    -0.94
<unused47>    -0.94
<unused28>    -0.94
<unused11>    -0.94
<unused14>    -0.94
Positive Logits
.     0.57
.     0.47
$.    0.42
。    0.41
).    0.36
*.    0.35
?.    0.34
।     0.34
}.    0.34
_.    0.33
Activations Density: 0.025%