    Explanations

    discussions around neutrality and bias in research

    Negative Logits
    Token               Logit
    " FetchType"        -0.50
    "ChildScrollView"   -0.49
    "AddTagHelper"      -0.48
    "jednoc"            -0.47
    " Inscrivez"        -0.46
    " kirj"             -0.46
    "Pcd"               -0.45
    "ândia"             -0.44
    "ætter"             -0.43
    "strptime"          -0.43
    Positive Logits
    Token               Logit
    " bias"             2.04
    " biased"           1.86
    " Bias"             1.71
    " biases"           1.69
    "bias"              1.59
    "biased"            1.56
    "Bias"              1.53
    "biases"            1.33
    "BIAS"              1.30
    " favoring"         1.18
    Activation Density: 0.609%

    No Known Activations