NEGATIVE LOGITS
a           0.96
<0x0D>      0.92
א           0.85
in          0.85
い          0.81
</h2>       0.80
expected    0.79
er          0.73
ands        0.73
asli        0.71
POSITIVE LOGITS
for             1.02
compensates     0.89
Compens         0.76
wwww            0.75
وين             0.72
ozione          0.72
InEx            0.71
㍉              0.71
                0.71
compensating    0.70
Activations Density: 0.001%