INDEX
    Explanations
    Negative Logits
    "ילות"        -2.42
    " lidé"       -2.39
    " hábiles"    -2.36
    " respostas"  -2.34
    "ntk"         -2.33
    " inept"      -2.22
    "דש"          -2.20
    " invitar"    -2.19
    " sám"        -2.17
    " dichas"     -2.13
    Positive Logits
    " While"      2.69
    "(?)"         2.64
    "C"           2.61
    "E"           2.34
    "st"          2.23
    "is"          2.23
    " jose"       2.22
    "Однако"      2.19
    " fritt"      2.19
    "However"     2.16
    Activation Density: 0.004%

    No Known Activations