INDEX
NEGATIVE LOGITS
challenge     -0.08
(batch        -0.07
factor        -0.07
enerji        -0.06
-muted        -0.06
_responses    -0.06
DEFIN         -0.06
Frage         -0.06
евер          -0.06
makin         -0.06
POSITIVE LOGITS
rejo          0.07
plagiar       0.06
Angie         0.06
na            0.06
ข             0.06
exec          0.06
_VAL          0.06
Kim           0.06
Lith          0.06
,无           0.06
Activations Density 0.026%