INDEX
NEGATIVE LOGITS
贴              -0.07
_locked         -0.07
recognizes      -0.06
/H              -0.06
pressed         -0.06
absorbed        -0.06
mediately       -0.06
_ly             -0.06
awful           -0.06
کیلومتر         -0.06
POSITIVE LOGITS
fecha           0.07
Tart            0.07
Độ              0.06
theological     0.06
ПО              0.06
Bạn             0.06
Lincoln         0.06
seb             0.06
Záp             0.06
);↵↵↵↵↵         0.06
Activations Density 0.053%