Explanations
programming syntax and structures
Negative Logits
Token       Logit
ngo         -0.18
Twenty      -0.18
Twenty      -0.17
nineteen    -0.17
33          -0.17
332         -0.17
twenty      -0.17
34          -0.16
19          -0.16
twenty      -0.16
Positive Logits
Token       Logit
(missing)   0.41
(missing)   0.38
(missing)   0.32
(missing)   0.30
110         0.26
(missing)   0.26
(missing)   0.23
(missing)   0.23
(missing)   0.23
116         0.22
Activations Density: 0.010%