Explanations
placeholders and formatting
Negative Logits
a	1.77
ه	1.63
j	1.57
u	1.29
া	1.28
	1.22
b	1.16
ج	1.16
l	1.13
d	1.11
Positive Logits
as	1.48
;	1.33
"	1.27
innych	1.19
できる	1.18
:	1.17
ות	1.16
ını	1.13
ﺔ	1.10
”	1.09
Activations Density: 0.002%