INDEX
    Explanations
    No Explanations Found
    Negative Logits
    "이다"  1.27
    "لي"  1.11
    "ยัง"  1.11
    " endere"  1.09
    "N"  1.09
    (no token shown)  1.08
    "يا"  1.07
    (no token shown)  1.07
    "}*/"  1.05
    (no token shown)  1.05
    Positive Logits
    ","  1.80
    "i"  1.31
    "an"  1.28
    "u"  1.28
    "a"  1.27
    ";"  1.23
    "ur"  1.22
    "،"  1.21
    "er"  1.20
    " i"  1.16
    Act Density 0.000%

    No Known Activations