INDEX
    Explanations

    phrases indicating negation or denial

    negations or phrases expressing denial or exclusion

    Negative Logits
    "ushes"     -0.71
    "uan"       -0.66
    " Tens"     -0.65
    "Ö¼"        -0.65
    "Ĥİ"        -0.64
    "ourney"    -0.64
    "riber"     -0.63
    "awks"      -0.63
    "velt"      -0.61
    "anders"    -0.60
    Positive Logits
    " necessarily"   1.09
    "icably"         1.03
    "hin"            1.03
    "eworthy"        1.02
    "orious"         1.00
    " permitted"     0.99
    "icable"         0.98
    " amused"        0.95
    " yet"           0.94
    "ifying"         0.90
    Act Density 0.126%

    No Known Activations