INDEX
NEGATIVE LOGITS
are          0.77
was          0.62
{            0.58
not          0.58
prejudiced   0.56
severely     0.55
inherently   0.54
zowel        0.53
haired       0.52
supremely    0.52
POSITIVE LOGITS
ת            0.77
thema        0.68
The          0.64
ли           0.63
נו           0.63
n            0.62
ু            0.60
闗           0.59
ла           0.56
time         0.56
Activations Density 0.016%