Negative Logits
are          1.34
in           1.22
e            1.06
\            1.02
disappoint   0.96
to           0.95
as           0.94
so           0.93
câștig       0.89
be           0.87

Positive Logits
n            2.08
m            2.03
b            1.73
is           1.73
p            1.56
ac           1.42
r            1.42
y            1.37
৭            1.30
f            1.29
Activations Density 0.067%