INDEX
Explanations
possessive or attributive relation
Negative Logits
Token   Logit
h       1.65
(       1.40
ad      1.27
\       1.23
>       1.21
Q       1.19
E       1.18
S       1.16
ض       1.13
Z       1.12
Positive Logits
Token   Logit
с       2.13
ра      1.80
س       1.50
то      1.47
ли      1.44
ре      1.41
ла      1.29
।       1.25
ס       1.25
ро      1.23
Activations Density: 0.065%