INDEX
NEGATIVE LOGITS
digitally    -0.08
Pauline      -0.08
_gender      -0.08
nica         -0.08
decimal      -0.08
genders      -0.08
gender       -0.08
Gender       -0.08
است          -0.08
Decimal      -0.08
POSITIVE LOGITS
dominate     0.10
worst        0.10
heuristic    0.10
Worst        0.10
Worst        0.10
Estimates    0.09
dominates    0.09
peb          0.09
/compiler    0.09
estimates    0.09
Activations Density 0.011%