INDEX
    Explanations
    Negative Logits
    IOS  -0.08
     admittedly  -0.08
     spoke  -0.07
     impress  -0.07
     decept  -0.07
    VA  -0.07
     CDs  -0.07
    Kos  -0.07
     напряж  -0.07
     கூட  -0.07
    Positive Logits
     esimerkiksi  0.10
     উচিত  0.09
     trovare  0.09
     richtige  0.08
     safer  0.08
    、更  0.08
     Bunun  0.08
    もっと  0.08
    ўся  0.08
     বিষয়  0.08
    Act Density 0.045%

    No Known Activations