INDEX
Explanations
symbols and punctuation marks used in mathematical or technical contexts
Numbers, especially single digits
numbers and lists
Negative Logits
Six      -0.66
Six      -0.65
Nine     -0.65
SIX      -0.63
Nine     -0.60
Eight    -0.60
Eight    -0.59
Seven    -0.57
nine     -0.56
Twenty   -0.55
Positive Logits
3        1.13
1        1.10
2        1.06
4        1.05
5        0.98
6        0.92
7        0.90
8        0.85
9        0.80
0        0.71
Activations Density: 0.101%