INDEX
    Explanations
    Negative Logits
        "##"  -0.08
        " onFocus"  -0.07
        " בחדר"  -0.07
        " handleClose"  -0.07
        " condem"  -0.07
        "ccione"  -0.07
        "tür"  -0.07
        (no token shown)  -0.06
        ".icon"  -0.06
        "Nam"  -0.06
    Positive Logits
        " presentation"  0.07
        "ailability"  0.07
        " clay"  0.07
        " millennials"  0.06
        "执法"  0.06
        "七星"  0.06
        "merged"  0.06
        "اف"  0.06
        (no token shown)  0.06
        " womens"  0.06
    Activation Density: 0.086%

    No Known Activations