INDEX
    Explanations

    terms related to negative conditions or outcomes

    Negative Logits
    Token       Logit
    "rosso"     -0.18
    "esub"      -0.16
    "uvo"       -0.15
    "immel"     -0.14
    "rellas"    -0.14
    "धर"        -0.14
    "etas"      -0.14
    "959"       -0.14
    "शन"        -0.13
    "anza"      -0.13
    Positive Logits
    Token           Logit
    " depending"    1.03
    "depending"     0.93
    " Depending"    0.63
    " depends"      0.63
    "Depending"     0.59
    " depend"       0.57
    "depends"       0.56
    " Depends"      0.55
    " depended"     0.54
    " tùy"          0.51
    Activation Density: 0.350%

    No Known Activations