Explanations
negative indicators of bias or discrimination
Negative Logits
UIControlState  -0.60
ודם             -0.60
ESTE            -0.60
бливості        -0.59
Adair           -0.59
Demografie      -0.58
Swann           -0.58
Kimmel          -0.58
civilised       -0.58
sớm             -0.57
Positive Logits
LookAnd         0.90
(missing)       0.89
parsedMessage   0.83
XNUMX           0.75
Stande          0.74
出版年          0.74
RenderAtEndOf   0.73
*)__            0.72
mettant         0.70
pérd            0.70
Activations Density 0.002%