INDEX
Explanations
phrases related to editing and textual changes in literature
Negative Logits
(blank)	-1.44
''	-1.37
”—	-1.30
.”—	-1.29
—-	-1.28
〜	-1.19
\-	-1.15
—	-1.15
(blank)	-1.14
\_	-1.12
Positive Logits
–	2.58
–,	1.40
أيضاً	1.24
്	1.15
اً	1.10
راً	1.08
ায়	1.07
ര്	1.05
ല്	1.04
باً	1.03
Activations Density 0.439%