INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "talk"         -0.07
    "ارت"          -0.06
    " epidemic"    -0.06
    "duplicate"    -0.06
    "ùa"           -0.06
    "就算"         -0.06
    ".spacing"     -0.06
    " talk"        -0.06
    " Put"         -0.06
    " acidic"      -0.06
    POSITIVE LOGITS
    Token             Logit
    "(parseInt"       0.07
    " mimo"           0.07
    " standing"       0.06
    " szer"           0.06
    " descon"         0.06
                      0.06
    "ини"             0.06
    "_ERRORS"         0.06
    " threatening"    0.06
    " Twice"          0.06
    Activation Density: 0.021%

    No Known Activations