Explanations
punctuation marks, particularly periods and asterisks
Tokens surrounded by asterisks
numerical lists or bullet points
Negative Logits
Token          Logit
vrolet         -0.91
otheby         -0.83
ousands        -0.82
NUMX           -0.82
uawei          -0.82
vielen         -0.79
stdc           -0.79
cửa            -0.77
ratulations    -0.77
^(@)           -0.77
Positive Logits
Token          Logit
                0.73
↵↵              0.72
*               0.61
*               0.60
The             0.57
When            0.56
·               0.53
li              0.53
by              0.52
<eos>           0.52
Activations Density 0.423%
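
The page does not define the activation density figure. A common reading is the percentage of tokens on which the feature fires at all, and the sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the source page does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical per-token activations: 3 active tokens out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.423% would mean the feature is active on roughly 4 of every 1000 tokens in the sampled text.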