INDEX
    Explanations

    negations or expressions of disbelief

    Negative Logits
    Token         Logit
    "lined"       -0.69
    " Seasons"    -0.68
    " Sparrow"    -0.68
    " Pric"       -0.67
    "itiz"        -0.66
    " Cutter"     -0.65
    "PDATE"       -0.64
    "tons"        -0.63
    " Tow"        -0.63
    " Cic"        -0.61
    Positive Logits
    Token             Logit
    " necessarily"    1.17
    " intend"         1.12
    " belong"         1.12
    " hesitate"       1.08
    " exist"          1.07
    " condone"        1.03
    " appear"         1.02
    " distinguish"    1.01
    " qualify"        1.01
    " endorse"        1.00
    Act Density 0.107%

    No Known Activations