INDEX
    Explanations
    Negative Logits
    Logit   Token
    -0.07   " hats"
    -0.07   " تف"
    -0.06   (token text not shown)
    -0.06   " conectar"
    -0.06   " promoter"
    -0.06   "chair"
    -0.06   " aplik"
    -0.06   " bakery"
    -0.06   " blacks"
    -0.06   " pItem"
    Positive Logits
    Logit   Token
    0.07    "fn"
    0.06    " теор"
    0.06    "///////////////////////////////////////////////////////////////////////////////↵"
    0.06    " asıl"
    0.06    " MAIL"
    0.06    " ancient"
    0.06    "########################"
    0.06    "vault"
    0.06    " Ар"
    0.06    "ALLED"
    Activation Density: 0.001%

    No Known Activations