    Negative Logits
    Token                    Logit
    " rampant"               -0.10
    (token not recovered)    -0.08
    " paradise"              -0.08
    " (!)"                   -0.08
    " vp"                    -0.07
    "Sight"                  -0.07
    "SYSTEM"                 -0.07
    "Vamos"                  -0.07
    " frivol"                -0.07
    " Pri"                   -0.07
    Positive Logits
    Token                    Logit
    (token not recovered)    0.08
    "tone"                   0.08
    " lign"                  0.08
    " usher"                 0.07
    "earned"                 0.07
    (token not recovered)    0.07
    "oben"                   0.07
    " Orch"                  0.07
    "overn"                  0.07
    " junk"                  0.07
    Activation Density: 0.592%

    No Known Activations