INDEX
NEGATIVE LOGITS
BASED    0.32
ську    0.32
ষ্ট    0.31
svoju    0.31
ري    0.31
(„    0.31
whatever    0.30
based    0.29
িলার    0.29
drugi    0.29
POSITIVE LOGITS
astray    0.70
leads    0.57
lead    0.55
إلى    0.54
to    0.54
到    0.49
到一个    0.48
credence    0.48
منجر    0.48
toa    0.47
Activations Density 0.006%