NEGATIVE LOGITS
Dense             -0.07
ugi               -0.06
celib             -0.06
.baseUrl          -0.06
application       -0.06
reportedly        -0.06
Nikola            -0.06
.Configuration    -0.06
Ki                -0.06
Ki                -0.06
POSITIVE LOGITS
unfair            0.09
fair              0.08
Fair              0.08
fair              0.07
↵                 0.07
pair              0.07
210               0.07
líd               0.07
fairness          0.07
campaign          0.06
Activations Density 0.011%