INDEX
    Explanations
    Negative Logits
    //*[        -0.08
     آذ         -0.07
     facade     -0.06
     deeds      -0.06
    !"          -0.06
                -0.06
    ccak        -0.06
    ereco       -0.06
    endir       -0.06
     heeft      -0.06
    Positive Logits
    레벨          0.07
     Moments    0.07
     coupled    0.07
    again       0.07
     thân       0.06
     Bonus      0.06
     rapidly    0.06
    kup         0.06
    swire       0.06
     Blogs      0.06
    Activation Density: 0.041%

    No Known Activations