Explanations
comments and annotations in code
Negative Logits
Token       Logit
(           -0.16
is          -0.15
-           -0.15
Bun         -0.15
olo         -0.15
racks       -0.15
ya          -0.14
Pent        -0.14
.           -0.14
aks         -0.14
Positive Logits
Token         Logit
~-~-~-~-      0.20
Č↵            0.20
eof           0.19
-BEGIN        0.17
BOOLE         0.17
UTILITY       0.16
convenience   0.16
\/\/          0.16
helpers       0.15
constants     0.15
Activations Density: 0.070%