    Negative Logits
    Token           Logit
    "amps"          -0.69
    "olated"        -0.66
    " Checking"     -0.66
    "CTV"           -0.66
    " Promotion"    -0.65
    "Tips"          -0.64
    " Fighting"     -0.60
    " imagining"    -0.59
    " Canary"       -0.59
    " Nurs"         -0.58
    Positive Logits
    Token           Logit
    " be"           1.02
    " become"       1.01
    " disappoint"   0.99
    " intensify"    0.99
    " happen"       0.99
    " occur"        0.96
    " explode"      0.95
    " provoke"      0.95
    " someday"      0.94
    " collide"      0.94
    Activation Density: 0.112%

    No Known Activations