INDEX
Negative Logits
S         1.39
}$        1.25
H         1.24
N         1.23
P         1.22
;         1.16
is        1.16
C         1.16
=         1.13
B         1.13
Positive Logits
are       1.11
separate  1.04
as        0.98
т         0.97
امن       0.91
inl       0.91
да        0.90
etition   0.90
де        0.89
다        0.89
Activations Density 0.087%