INDEX
NEGATIVE LOGITS
benefit           -0.07
meal              -0.07
Resistance        -0.07
baby              -0.07
Forest            -0.07
officers          -0.07
cohesive          -0.07
Colors            -0.07
Subset            -0.07
Find              -0.07
POSITIVE LOGITS
ง                 0.06
псих              0.06
собствен          0.06
.ResponseEntity   0.06
przed             0.06
messed            0.06
jež               0.06
(pad              0.06
Tomáš             0.06
자신              0.06
Activations Density 0.000%