Negative Logits
രിച്ചത്    0.50
いつ    0.50
そのため    0.50
னம்    0.48
किससे    0.48
どの    0.47
常に    0.47
ದ್ದರಿಂದ    0.47
そこ    0.46
어떤    0.46

Positive Logits
does    0.90
did    0.88
are    0.79
did    0.69
does    0.68
do    0.67
Does    0.66
Does    0.65
is    0.59
Did    0.58
Activations Density 0.035%