INDEX
Explanations
references to specific historical dates and numeric identifiers
Negative Logits
third    -0.36
three    -0.32
3        -0.32
03       -0.31
Third    -0.29
third    -0.29
三       -0.29
第三     -0.28
³        -0.28
三       -0.28
Positive Logits
6        0.38
5        0.31
7        0.28
sixth    0.27
Sixth    0.27
６       0.27
۶        0.24
六       0.24
६        0.23
六       0.23
Activations Density 0.078%