Negative Logits
herd      -0.08
dará      -0.08
ş         -0.08
dira      -0.08
da        -0.08
weak      -0.07
şek       -0.07
kuwa      -0.07
Ш         -0.07
isi       -0.07
Positive Logits
Musical   0.09
gotten    0.08
musical   0.08
Doesn     0.08
worked    0.08
Mus       0.07
достой    0.07
ős        0.07
ochond    0.07
,col      0.07
Activations Density: 1.324%