    Explanations

    instances of rejection or refusal in decision-making contexts

    Negative Logits
    Token           Logit
    ".Annotations"  -0.16
    "ERRU"          -0.15
    "clare"         -0.15
    "ubo"           -0.15
    "arkan"         -0.15
    "loub"          -0.14
    "asje"          -0.14
    "ctl"           -0.14
    ";br"           -0.14
    "ENCIL"         -0.14

    Positive Logits
    Token         Logit
    " offers"     0.24
    "Reject"      0.23
    " offer"      0.23
    " Reject"     0.23
    " reject"     0.23
    " rejected"   0.23
    " rejection"  0.23
    "reject"      0.22
    " offered"    0.22
    " declined"   0.21

    Activation Density: 0.125%

    No Known Activations