INDEX
Negative Logits
on      1.52
s       1.37
and     1.23
that    1.22
was     1.17
ovat    0.99
is      0.98
ب       0.96
at      0.94
𝘱       0.94
Positive Logits
ت       1.50
м       1.04
Сер     1.02
<       1.02
>       0.95
Ми      0.93
Сма     0.93
تس      0.91
[       0.91
<td>    0.91
Activations Density 0.002%