    Negative Logits
     bru          -0.07
    ποτε          -0.07
    .permission   -0.06
                  -0.06
     Adolf        -0.06
    indexPath     -0.06
    _DESC         -0.06
     ofrece       -0.06
     veut         -0.06
     Blvd         -0.06
    Positive Logits
     lam          0.14
     Lam          0.12
    am            0.08
     lamin        0.08
    lam           0.08
    AM            0.07
    кам           0.07
    wagon         0.07
     noticeably   0.07
     Hamm         0.07
    Act Density 0.003%

    No Known Activations