Explanations
discussions around neutrality and bias in research
Negative Logits
Token             Logit
FetchType         -0.50
ChildScrollView   -0.49
AddTagHelper      -0.48
jednoc            -0.47
Inscrivez         -0.46
kirj              -0.46
Pcd               -0.45
ândia             -0.44
ætter             -0.43
strptime          -0.43
Positive Logits
Token             Logit
bias              2.04
biased            1.86
Bias              1.71
biases            1.69
bias              1.59
biased            1.56
Bias              1.53
biases            1.33
BIAS              1.30
favoring          1.18
Activations Density: 0.609%