INDEX
    Explanations
    Negative Logits
     kadar       -0.06
     cylinder    -0.06
    /.           -0.06
     tand        -0.06
     rear        -0.06
     verst       -0.06
     soaring     -0.06
     exe         -0.06
     tham        -0.06
    haf          -0.06
    Positive Logits
    (predictions  0.07
     Больш        0.07
     Cut          0.07
    laughter      0.07
    xbd           0.06
    αρα           0.06
     زیاد         0.06
    PCS           0.06
    %"><          0.06
    атора         0.06
    Activation Density 0.001%

    No Known Activations