INDEX
    Explanations

    mentions related to sickness or illness

    expressions of illness or dissatisfaction, particularly in a colloquial context

    Negative Logits
    Token            Logit
    " unlaw"         -0.75
    " compr"         -0.66
    " Unch"          -0.66
    "wcsstore"       -0.66
    "adr"            -0.64
    "ETHOD"          -0.63
    " sanctioned"    -0.60
    "merce"          -0.60
    " rul"           -0.60
    " Hier"          -0.59
    Positive Logits
    Token            Logit
    "ening"          1.29
    "ened"           1.18
    "bay"            1.16
    "nesses"         0.96
    "ly"             0.93
    "est"            0.89
    "ert"            0.88
    "etts"           0.86
    "le"             0.85
    "ness"           0.85
    Activation Density: 0.019%

    No Known Activations