INDEX
NEGATIVE LOGITS
restriction    -0.08
alta           -0.06
Dum            -0.06
hair           -0.06
پرداز          -0.06
хи             -0.06
.Restr         -0.06
iba            -0.06
ετ             -0.06
ioned          -0.06
POSITIVE LOGITS
fart           0.06
TLabel         0.06
errone         0.06
tournament     0.06
oppression     0.06
넘             0.06
.Commit        0.06
пад            0.05
))↵            0.05
flashlight     0.05
Activations Density 0.007%