Explanations
syntactical structures and code-like syntax elements
Negative Logits
Token     Logit
8         -0.32
八        -0.29
八        -0.28
eight     -0.27
August    -0.25
eight     -0.24
Aug       -0.23
Eight     -0.23
Eight     -0.23
08        -0.23
Positive Logits
Token         Logit
(not shown)   0.29
(not shown)   0.28
107           0.21
105           0.20
(not shown)   0.19
(not shown)   0.18
Seven         0.17
(not shown)   0.16
↵ ↵           0.16
Juli          0.16
Activations Density 0.030%