INDEX
NEGATIVE LOGITS
Token       Logit
scratching  -0.07
꼭          -0.07
σύν         -0.07
criter      -0.07
shattered   -0.07
isAdmin     -0.06
patiently   -0.06
итом        -0.06
Marian      -0.06
po          -0.06
POSITIVE LOGITS
Token        Logit
unlike       0.09
Unlike       0.08
Unlike       0.07
으며         0.06
differently  0.06
YL           0.06
ουλ          0.06
Illegal      0.06
dır          0.06
.down        0.06
Activations Density 0.007%