INDEX
Explanations
structured numeric elements, likely related to sections or subsections of a document
Negative Logits
</b>              -0.72
</i>              -0.62
<b>               -0.60
<eos>             -0.60
</                -0.60
<i>               -0.57
↵                 -0.56
></               -0.56
(no token text)   -0.55
</                -0.53
Positive Logits
iii               1.27
VII               1.22
VII               1.22
VIII              1.16
III               1.15
VIII              1.10
XII               1.09
XIII              1.09
XIII              1.08
IX                1.08
Activations Density: 0.255%