INDEX
    Explanations
    Negative Logits
    -being        -0.07
     attitude     -0.07
     stride       -0.06
     Challenges   -0.06
    คราม          -0.06
     duplicates   -0.06
    _sites        -0.06
    -cli          -0.06
     thọ          -0.06
     bigot        -0.06
    Positive Logits
    Finding       0.07
     تولید        0.07
     oček         0.07
     Jury         0.06
     Exhibit      0.06
     IOError      0.06
    ntag          0.06
                  0.06
     Multip       0.06
    ''"           0.06
    Activation Density 0.191%

    No Known Activations