    Negative Logits
    " Standard"   -0.07
    "mic"         -0.07
    " snowy"      -0.07
    " tuyệt"      -0.07
    " Dy"         -0.06
    "(dic"        -0.06
    "(val"        -0.06
    " tie"        -0.06
    "Dic"         -0.06
    " mi"         -0.06
    Positive Logits
    " confront"        0.14
    " confronted"      0.12
    " confronting"     0.12
    " confrontation"   0.12
    "forcing"          0.07
    "ngr"              0.07
    "Atual"            0.07
    "person"           0.07
    "เพล"              0.07
    "-alist"           0.07
    Act Density 0.005%

    No Known Activations