INDEX
Negative Logits
is      0.90
are     0.70
üp      0.63
it      0.62
很多    0.57
k       0.57
üyor    0.57
ti      0.55
みが    0.54
to      0.54
POSITIVE LOGITS
на      0.91
1       0.89
р       0.89
3       0.77
د       0.77
น       0.76
us      0.75
м       0.71
2       0.71
(       0.69
Activations Density 0.001%