    NEGATIVE LOGITS
    (token not shown)  2.38
    i  2.17
    (token not shown)  2.08
    ه  2.03
    ر  1.87
    ి  1.66
    ל  1.66
    ational  1.61
    inds  1.61
    (token not shown)  1.60
    POSITIVE LOGITS
    ción  1.96
     albo  1.80
    ח  1.77
     académ  1.75
    পূর্ণ  1.65
     изготов  1.64
     métallique  1.63
     randomIndex  1.57
    (token not shown)  1.56
    çı  1.54
    Act Density 0.167%

    No Known Activations