    Negative Logits
    Token          Logit
    "-first"       -0.07
    " Elem"        -0.07
    " barn"        -0.07
    " inverted"    -0.06
    "datable"      -0.06
    "-"            -0.06
    " Cary"        -0.06
    " Barn"        -0.06
    (blank)        -0.06
    "ysa"          -0.06
    Positive Logits
    Token              Logit
    " body"            0.06
    ".groupControl"    0.06
    " loss"            0.06
    " helm"            0.06
    ".group"           0.06
    "vailable"         0.06
    " familia"         0.06
    " الموقع"          0.06
    " passive"         0.06
    " NL"              0.06
    Activation Density 0.045%

    No Known Activations