    NEGATIVE LOGITS
    Token               Logit
    " melhorar"         0.36
    " সত"               0.36
    " सीएचएसएल"          0.36
    " wst"              0.36
    " Pell"             0.35
    " deficiencies"     0.35
    " להי"              0.35
    " erreur"           0.35
    " bienvenidas"      0.35
    (missing)           0.35
    POSITIVE LOGITS
    Token               Logit
    " control"          2.56
    "control"           2.33
    "控制"              2.30
    "Control"           2.22
    " kontrol"          2.17
    " контроля"         2.17
    " контроль"         2.17
    " Kontrolle"        2.17
    " Control"          2.14
    " controllo"        2.14
    Activation Density: 0.081%

    No Known Activations