INDEX
Explanations
punctuation marks that signify the end of sentences
numbers, periods, and specific names
Negative Logits
Token         Logit
arangay       -0.71
征詢我        -0.70
ſei           -0.69
niſſe         -0.69
<unused41>    -0.69
<unused42>    -0.68
phazard       -0.68
<unused23>    -0.68
[@BOS@]       -0.68
<unused1>     -0.68
Positive Logits
Token         Logit
.             0.51
item          0.46
:             0.43
*             0.42
subsection    0.41
دانشنامهٔ      0.40
'.            0.39
</tr>         0.38
*.            0.38
noDo          0.37
Activations Density 0.106%
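
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not confirm this definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical data: the feature fires on 1 of 1000 token positions.
    sample = [0.0] * 999 + [2.5]
    print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.106% would mean the feature is active on roughly one token in every thousand sampled.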