Explanations
repeated patterns or sequences of similar values in a numerical context
Negative Logits
TagMode       -1.40
queſta        -1.38
ſchaft        -1.31
<unused52>    -1.29
<unused23>    -1.28
<unused16>    -1.28
<unused41>    -1.28
<unused43>    -1.28
<unused8>     -1.28
<unused14>    -1.28
Positive Logits
↵             0.53
…             0.41
’             0.37
.             0.34
1             0.32
...           0.30
“             0.29
(blank)       0.28
-             0.28
↵↵            0.28
Activations Density 1.235%