INDEX
    Explanations
    Negative Logits
     descubrió      -1.02
    tainable        -0.94
     from           -0.94
     for            -0.91
     ilustracji     -0.90
    Belum           -0.87
     bebas          -0.87
    ρών             -0.85
    вної            -0.85
    クシー             -0.85
    Positive Logits
     successfully   2.73
     successful     2.22
     Successfully   2.09
    Successfully    2.06
     success        2.00
     sucess         1.85
     succes         1.84
    successfully    1.77
     успешно        1.74
     SUCCESS        1.68
    Activation Density: 0.011%

    No Known Activations