INDEX
    Explanations
    Negative Logits
    Token            Logit
    " Vish"          -0.06
    " fourn"         -0.06
    " reverence"     -0.06
    "Creat"          -0.06
    " Mec"           -0.06
    "_pl"            -0.06
    " PURE"          -0.06
    " overl"         -0.06
    " produce"       -0.06
    " equiv"         -0.06
    Positive Logits
    Token            Logit
    " longtime"      0.15
    ".faces"         0.08
    " Say"           0.07
    " longstanding"  0.07
    "Sign"           0.07
    [token missing]  0.07
    ".amount"        0.07
    " drinking"      0.06
    " tout"          0.06
    " Intelli"       0.06
    Act Density 0.007%

    No Known Activations