NEGATIVE LOGITS
Token        Logit
Italian      -0.07
Nun          -0.07
Silence      -0.07
quarters     -0.07
Fri          -0.07
remedy       -0.06
discovers    -0.06
disliked     -0.06
yard         -0.06
FRING        -0.06
POSITIVE LOGITS
Token        Logit
يج           0.07
-lang        0.06
apro         0.06
acak         0.06
(Target      0.06
польз        0.06
ellipsis     0.06
transc       0.06
Preis        0.06
дов          0.06
Activations Density: 0.021%