INDEX
    Explanations
    Negative Logits
     thoroughly
    -0.08
     البن
    -0.07
    ωμα
    -0.07
     Série
    -0.07
    ذي
    -0.07
     vole
    -0.07
    \ORM
    -0.07
     Goog
    -0.07
    uron
    -0.07
    etail
    -0.07
    Positive Logits
     sustent
    0.08
     sustain
    0.07
     certify
    0.07
     مزد
    0.07
     duas
    0.07
     قبول
    0.07
    caught
    0.07
     cong
    0.07
    xfe
    0.07
     পার
    0.07
    Activation Density: 0.000%

    No Known Activations