INDEX
Explanations
list items or numbered options
NEGATIVE LOGITS
Token                  Logit
[token not captured]   2.11
ς                      1.50
s                      1.23
Н                      1.20
심히                   1.16
ählt                   1.16
this                   1.14
К                      1.13
C                      1.10
С                      1.08
POSITIVE LOGITS
Token                  Logit
.}$                    1.56
。“                    1.53
v                      1.53
ج                      1.52
ર                      1.44
.}                     1.41
라면                   1.41
্তে                    1.35
涳                     1.34
.“                     1.34
Activations Density: 0.631%