INDEX
Negative Logits
os        0.30
itin      0.29
img       0.29
-         0.29
um        0.28
od        0.28
iation    0.27
water     0.27
=         0.26
y         0.26
Positive Logits
in        0.34
에        0.27
们的      0.25
as        0.25
के        0.25
ſelf      0.25
در        0.25
at        0.23
into      0.23
면        0.23
Activations Density 0.679%