INDEX
Explanations
concepts related to fairness and justice
Negative Logits
Token          Logit
heretofore     -0.43
Bernadette     -0.43
packageName    -0.41
MDC            -0.40
uride          -0.40
Rocco          -0.40
DTM            -0.40
ederen         -0.40
IMS            -0.39
Proto          -0.39
Positive Logits
Token          Logit
Fair           1.42
fair           1.41
Fair           1.39
fair           1.38
unfair         1.27
fairness       1.27
Fairness       1.24
FAIR           1.23
FAIR           1.21
fairer         1.16
Activations Density: 0.079%