INDEX
Explanations
punctuation marks
punctuation marks, particularly commas and colons
Negative Logits
Token      Logit
tarian     -0.75
backer     -0.74
atively    -0.74
ensible    -0.73
arted      -0.73
obook      -0.72
RF         -0.71
ory        -0.70
inking     -0.70
ought      -0.70

Positive Logits
Token      Logit
qui        1.09
il         1.05
si         1.04
et         1.01
eh         0.99
ja         0.98
tu         0.98
ni         0.97
la         0.95
nun        0.92
Activations Density: 0.048%
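
The page does not define the density figure. A minimal sketch of one common reading, assuming it is the percentage of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 12 firing tokens out of 25,000 sampled tokens.
sample = [0.0] * 24_988 + [1.3] * 12
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.048%
```

Under this reading, a density of 0.048% corresponds to roughly one firing token in every 2,000 sampled.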