INDEX
    Explanations

    expressions of denial or refusal

    Negative Logits
    Token                Logit
    "texttt"             -0.71
    "таж"                -0.66
    " Hig"               -0.64
    "usk"                -0.63
    "tagHelperRunner"    -0.62
    (blank)              -0.61
    " AppCompat"         -0.61
    "TableField"         -0.60
    "TagHelpers"         -0.60
    " Natasha"           -0.59

    Positive Logits
    Token                Logit
    " denies"             1.71
    " denied"             1.71
    " deny"               1.65
    " denial"             1.65
    " denying"            1.56
    "denied"              1.51
    " Deny"               1.49
    "denial"              1.48
    " Denied"             1.48
    " Denial"             1.47
    Activation Density: 0.128%

    No Known Activations