INDEX
    Explanations
    Negative Logits
    Token               Logit
    " becomes"          -0.07
    " bigotry"          -0.06
    " créer"            -0.06
    " pregnancy"        -0.06
    "انات"              -0.06
    " endeavour"        -0.06
    " xem"              -0.06
    " rushes"           -0.06
    "Ass"               -0.06
    "Watch"             -0.06
    Positive Logits
    Token               Logit
    " élect"            0.07
    "natal"             0.07
    "filesize"          0.06
    [token not shown]   0.06
    "říd"               0.06
    "stile"             0.06
    "esModule"          0.06
    " UIApplication"    0.06
    "ائی"               0.06
    " imkan"            0.06
    Activation Density: 0.020%

    No Known Activations