    Negative Logits
    t     1.80
    ک     1.70
    el    1.67
    se    1.63
    ون    1.61
    la    1.55
    v     1.52
    ن     1.52
     I    1.50
    re    1.46
    Positive Logits
    [token not shown]    1.37
    [token not shown]    1.22
    కు    1.19
    ική    1.18
    ного    1.12
     compartilh    1.11
    是对    1.07
    رويج    1.06
    [token not shown]    1.06
    خدام    1.05
    Act Density 0.000%

    No Known Activations