    Negative Logits
    Logit   Token
    -0.07   '<Item'
    -0.07   ' قالب'
    -0.07   ' císa'
    -0.06   ' StringIO'
    -0.06   'FSIZE'
    -0.06   ' Oper'
    -0.06   ' appetite'
    -0.06   ' бактер'
    -0.06   'spiracy'
    -0.06   ' RCA'
    Positive Logits
    Logit   Token
    0.06    (token not rendered)
    0.06    ' biomedical'
    0.06    ' directed'
    0.06    '리로'
    0.06    'fight'
    0.06    'ilon'
    0.06    ' hiring'
    0.06    (token not rendered)
    0.06    '.logic'
    0.06    '"""↵↵'
    Activation Density: 0.016%

    No Known Activations