Explanations
the number 7, especially paired with other digit characters
Negative Logits
Token      Logit
Seven      -1.55
seven      -1.38
7          -1.30
SEVEN      -1.28
Seven      -1.27
Seventy    -1.21
seven      -1.17
Seventh    -1.15
seventh    -1.12
seventy    -1.10
Positive Logits
Token      Logit
↵          0.74
.          0.66
,          0.63
os         0.57
(          0.53
?          0.50
(blank)    0.50
:          0.48
in         0.48
it         0.47
Activations Density 1.377%