INDEX
    Explanations
    Negative Logits
    Token         Logit
    "Cls"         -0.07
    "adoras"      -0.07
    "????????"    -0.07
    "tabl"        -0.07
    "_generate"   -0.07
    " CURL"       -0.07
    "byname"      -0.06
    "rel"         -0.06
    "Pragma"      -0.06
    " Icon"       -0.06
    Positive Logits
    Token         Logit
    " الکتر"      0.06
    "Additional"  0.06
    "isten"       0.06
    "۳۶"          0.06
    "sb"          0.06
    "FB"          0.06
    " Gluten"     0.06
    "paragraph"   0.05
    " vv"         0.05
    " storia"     0.05
    Act Density 0.010%

    No Known Activations