Negative Logits
then         -0.78
now          -0.61
now          -0.59
then         -0.57
lalu         -0.54
Then         -0.53
herself      -0.52
Now          -0.51
damaligen    -0.50
entonces     -0.48
Positive Logits
it           0.86
you          0.86
they         0.81
DockStyle    0.72
,            0.71
that         0.68
we           0.66
there        0.64
فريبيس       0.60
consider     0.59
Activations Density 0.037%