Explanations
punctuation marks, particularly periods and quotation marks
Negative Logits
...↵↵    -0.15
(↵↵    -0.14
—↵↵    -0.14
“    -0.14
بÙĬÙĨ    -0.14
â̦↵↵    -0.14
&    -0.14
--↵↵    -0.13
”    -0.13
.lua    -0.13
Positive Logits
.↵    0.20
ा.↵    0.16
â̬↵    0.15
."↵    0.14
).↵    0.13
avo    0.13
comed    0.13
à¥Ī.↵    0.13
ี↵    0.13
ë§¹    0.13
Activations Density 1.024%