INDEX
    Explanations
    Negative Logits
     active     -0.07
    ्रभ         -0.07
     Scar       -0.07
    _el         -0.06
    oles        -0.06
    <Item       -0.06
    ائه         -0.06
     SCORE      -0.06
     educ       -0.06
    [token missing in source]  -0.06
    Positive Logits
     reviewer   0.07
     XBOOLE     0.07
     کتاب       0.07
     imshow     0.06
    _png        0.06
    /var        0.06
    WG          0.06
    }).         0.06
     -(         0.06
     шаг        0.06
    Act Density 0.082%

    No Known Activations