Negative Logits
k             1.58
y             1.33
p             1.26
c             1.24
l             1.23
v             1.16
an            1.15
<0x0D>        1.07
x             1.05
kamer         1.05

Positive Logits
neutral       1.48
Neutral       1.48
Neutral       1.40
neutral       1.32
ла            1.25
нейтра        1.23
یک            1.18
neutrals      1.12
neutrality    1.11
ﮄ             1.05
Activations Density: 0.015%