INDEX
    Explanations
    Negative Logits
     humility       -0.07
     prescribing    -0.06
     Mot            -0.06
     educating      -0.06
     llevar         -0.06
     превыш         -0.06
    smooth          -0.06
                    -0.06
                    -0.06
                    -0.06
    Positive Logits
    xAB             0.07
    ила             0.07
     Mour           0.07
    (team           0.07
    (cnt            0.07
    :s              0.06
    odel            0.06
    )arg            0.06
     telephone      0.06
    Up              0.06
    Act Density 0.000%

    No Known Activations