    Negative Logits
    Token            Logit
    " baba"          -0.08
    " happening"     -0.08
    " удаления"      -0.07
    " favorita"      -0.07
    " bort"          -0.07
    " favorito"      -0.07
    "akur"           -0.07
    " Idi"           -0.07
    " Jug"           -0.07
    "agli"           -0.07
    Positive Logits
    Token            Logit
    ".symmetric"     0.09
    " Macau"         0.08
    " moistur"       0.08
    " בהתאם"         0.08
    (token text missing)  0.08
    " provision"     0.08
    "EEDED"          0.08
    "तान"            0.08
    "确保"           0.07
    " 根据"          0.07
    Activation Density: 0.006%

    No Known Activations