INDEX
    Negative Logits
    Token         Logit
    "Yang"        -0.08
    "vis"         -0.08
    " hum"        -0.08
    " PLA"        -0.08
                  -0.07
    ".restore"    -0.07
    "croll"       -0.07
    "was"         -0.07
    "AD"          -0.07
    ".am"         -0.07

    Positive Logits
    Token         Logit
    "kick"        0.08
    "್�"          0.08
                  0.07
    " Ald"        0.07
    " Mater"      0.07
    " conj"       0.07
    "ाख"          0.07
    " grains"     0.07
    "ल्य"          0.07
    " Carl"       0.07
    Activation Density: 0.017%

    No Known Activations