INDEX
    Explanations
    Negative Logits
    "ي"                 1.20
    (no token shown)    1.16
    " Idee"             1.15
    " svou"             1.10
    "ן"                 1.10
    (no token shown)    1.09
    " jeste"            1.06
    " مؤرشف"            1.05
    "ال"                1.05
    "ల్"                1.04
    Positive Logits
    "al"                1.43
    "n"                 1.21
    "do"                1.11
    "્સ"                1.03
    " runny"            1.03
    (no token shown)    1.03
    "ب"                 1.00
    "ding"              0.98
    "er"                0.97
    "tting"             0.97
    Act Density 0.000%

    No Known Activations