INDEX
NEGATIVE LOGITS
Logit   Token
-0.09   -nine
-0.09   Timing
-0.09   =================================================================
-0.09   genial
-0.09   Hoi
-0.09   Timing
-0.08   vooruit
-0.08   .sound
-0.08   cocina
-0.08   Cocina
POSITIVE LOGITS
Logit   Token
0.08    supervised
0.08    support
0.07    comprom
0.07    desired
0.07    imbal
0.07    TF
0.07    זרת
0.07    interventions
0.07    suffice
0.07    вруч
Activations Density: 0.005%