    Explanations

    negations and cautions regarding actions or recommendations

    Negative Logits
    Token       Logit
    "ux"        -0.15
    "AMPL"      -0.15
    "ssel"      -0.14
    ".pivot"    -0.14
    "izon"      -0.14
    "tring"     -0.14
    "rung"      -0.14
    " Gos"      -0.14
    " somehow"  -0.14
    "ubu"       -0.14
    Positive Logits
    Token         Logit
    " exceed"     0.30
    " EVER"       0.28
    " ever"       0.26
    " touch"      0.23
    " hesitate"   0.22
    " exceeds"    0.22
    " worry"      0.22
    " превыш"     0.22
    " exceeding"  0.21
    "-ever"       0.21
    Activation Density: 0.121%

    No Known Activations