INDEX
NEGATIVE LOGITS
-door       -0.08
-code       -0.08
calific     -0.08
घोषणा       -0.08
र्जी         -0.08
роб         -0.07
>Error      -0.07
dish        -0.07
바로        -0.07
Door        -0.07
POSITIVE LOGITS
公平        0.13
fairness    0.13
evenly      0.12
equitable   0.12
equally     0.11
balanced    0.10
Fair        0.10
unbiased    0.09
fair        0.09
Balanced    0.09
Activations Density 0.019%