INDEX
Explanations
instances of bias and corruption in political or judicial contexts
Negative Logits
StringTokenizer    -0.65
centralwidget      -0.63
exitRule           -0.61
complexContent     -0.58
debout             -0.57
Cataloging         -0.57
קישורים            -0.56
SequentialGroup    -0.56
CreateModel        -0.55
publicPath         -0.53
Positive Logits
bias      0.71
biased    0.69
<=",      0.58
biased    0.56
sway      0.56
favor     0.56
BIAS      0.55
bribe     0.54
Bias      0.53
biases    0.53
Activations Density 0.513%