INDEX
    Negative Logits
    Token           Logit
    "_ag"           -0.06
    " turmoil"      -0.06
    " hour"         -0.06
    " semen"        -0.06
    " casc"         -0.06
    "blue"          -0.06
    " floral"       -0.06
    " pharmacist"   -0.06
    " xn"           -0.06
    " конца"        -0.06
    Positive Logits
    Token           Logit
    " worthy"       0.31
    "orthy"         0.15
    "-worthy"       0.13
    "worthy"        0.12
    " deserving"    0.12
    " needy"        0.09
    "eworthy"       0.09
    "TED"           0.07
    " luyện"        0.07
    " Derm"         0.07
    Activation Density: 0.005%

    No Known Activations