Negative Logits
Token           Logit
favourite       -0.07
nut             -0.07
確              -0.06
polish          -0.06
Wander          -0.06
ruption         -0.06
cabin           -0.06
khô             -0.06
featured        -0.06
Hồng            -0.06

Positive Logits
Token           Logit
ोख              0.07
DJ              0.06
звер            0.06
действия        0.06
[list           0.06
чемпион         0.06
,!              0.06
Вер             0.06
(dw             0.06
discriminatory  0.06

Activations Density 0.000%