INDEX
    Explanations

    negations, particularly related to actions or behaviors that are not being done

    instances of the word "not" or phrases indicating negation

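    Negative Logits
    Token (quoted; a leading space is part of the token)    Logit
    " Expansion"      -0.70
    "stakes"          -0.67
    " Pros"           -0.66
    " Films"          -0.65
    "eers"            -0.64
    " Companies"      -0.64
    " Circuit"        -0.62
    " Kry"            -0.61
    " Basics"         -0.61
    " Tours"          -0.61

    Positive Logits
    Token (quoted; a leading space is part of the token)    Logit
    "icably"          1.41
    "epad"            1.19
    "icable"          1.15
    " necessarily"    1.09
    "hin"             1.07
    "ched"            0.97
    "orious"          0.93
    "ifying"          0.91
    "ices"            0.89
    "necess"          0.88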
    Act Density 0.176%
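
The page does not define the Act Density figure. The sketch below shows one way to compute such a value, assuming it means the fraction of sampled token positions at which the feature's activation is non-zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical; the sample is constructed only to reproduce 0.176% and is not the dashboard's underlying data.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "Act Density" is read here as (active positions) / (all positions).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 176 of them with a non-zero activation.
sample = [0.0] * 99_824 + [1.5] * 176
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.176% means the feature is active on roughly 1 in every 570 token positions, which is sparse.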

    No Known Activations