INDEX
    Explanations

    instances of refusal or non-compliance

    Negative Logits
    "XmlAccessType"     -0.84
    "حياته"             -0.75
    "tigt"              -0.70
    "IBOutlet"          -0.67
    "PathVariable"      -0.64
    "BuildContext"      -0.63
    "thern"             -0.63
    "seca"              -0.62
    "adal"              -0.61
    "fekt"              -0.61
    Positive Logits
    " refusé"           0.93
    " refusal"          0.92
    "Willing"           0.89
    " Refuse"           0.89
    " refus"            0.89
    " Willing"          0.86
    " refusing"         0.83
    " препратки"        0.83
    " refuse"           0.83
    " unwilling"        0.82
    Activation Density: 0.131%

    No Known Activations