INDEX
Negative Logits
as            1.20
ला            0.91
SON           0.89
cosmopolitan  0.88
a             0.84
and           0.81
eli           0.80
at            0.80
are           0.80
ре            0.79
Positive Logits
in            1.43
is            1.34
m             1.30
i             1.28
at            1.23
u             1.15
ي             1.07
r             1.06
ar            1.01
ad            1.01
Activations Density: 0.011%