    Negative Logits
    " tiền": -0.07
    " celui": -0.06
    "ouston": -0.06
    "phthalm": -0.06
    "_nsec": -0.06
    "CLUSION": -0.06
    " پیشنه": -0.06
    " accesses": -0.06
    " Cout": -0.06
    "_shutdown": -0.06
    Positive Logits
    "oner": 0.08
    " firm": 0.07
    (token not shown): 0.07
    ".Mon": 0.07
    " Amir": 0.07
    "irma": 0.07
    "ER": 0.07
    " succesfully": 0.07
    "مع": 0.07
    " came": 0.07
    Activation Density: 0.008%

    No Known Activations