INDEX
    Explanations
    Negative Logits
    irling        -0.09
    uale          -0.09
     odlič        -0.09
    olini         -0.08
    overall       -0.08
     parado       -0.08
    veni          -0.08
     oggi         -0.08
    ovne          -0.08
                  -0.08
    Positive Logits
     complaining  0.09
     sniff        0.09
    (Java         0.08
     مربوط        0.08
                  0.08
    Used          0.08
     kindly       0.08
    FID           0.08
     Dub          0.07
     JU           0.07
    Activation Density: 0.123%

    No Known Activations