INDEX
    Explanations
    Negative Logits
     reconstruction  -0.07
    _redirected      -0.07
     exploration     -0.06
     پیشینه          -0.06
     tragedy         -0.06
     Minuten         -0.06
     bus             -0.06
    ondheim          -0.06
     Convers         -0.06
     standing        -0.06
    Positive Logits
     gland           0.17
     glands          0.15
     Gel             0.08
     Sheldon         0.07
    jj               0.07
                     0.07
    iod              0.07
     Golden          0.06
                     0.06
    !$               0.06
    Activation Density: 0.002%

    No Known Activations