Explanations
numerical values and related scientific data
Negative Logits
Token        Logit
twenty       -0.67
thirteenth   -0.65
thirty       -0.64
forty        -0.64
fifteen      -0.64
eighteen     -0.63
Thirteenth   -0.62
Forty        -0.61
milliers     -0.61
Thirty       -0.61
Positive Logits
Token        Logit
TagMode      0.68
5            0.63
6            0.63
4            0.63
]")]         0.61
3            0.61
7            0.60
8            0.60
delwed       0.56
9            0.55
Activations Density: 0.496%