NEGATIVE LOGITS
כאשר  0.72
此外  0.69
अथवा  0.66
Furthermore  0.64
どのような  0.62
Additionally  0.62
различ  0.59
Subsequently  0.59
तथा  0.59
כך  0.58
POSITIVE LOGITS
want  1.06
haven  1.00
dunno  0.98
know  0.93
wanna  0.92
havent  0.91
đang  0.89
wants  0.88
messed  0.88
HAVE  0.88
Activations Density 0.771%