Negative Logits
i         -0.75
when      -0.69
When      -0.65
o         -0.63
s         -0.60
cuando    -0.60
e         -0.60
ed        -0.56
hesis     -0.56
quando    -0.55
Positive Logits
you          0.93
we           0.89
the          0.82
it           0.80
someone      0.72
there        0.71
I            0.66
Do           0.66
discussing   0.65
a            0.65
Activations Density: 0.066%