INDEX
    Explanations

    words related to actions that imply judgment or evaluation

    Negative Logits
    Token     Logit
    anny      -0.16
    èĢħãģ®    -0.14
    prm       -0.14
    amd       -0.14
    gles      -0.13
    fx        -0.13
    Sher      -0.13
    ÛĮÚ©      -0.13
    aviest    -0.13
    vt        -0.13

    Positive Logits
    Token     Logit
    ed        1.90
    edBy      1.05
    edb       0.90
    edn       0.87
    edl       0.85
    edm       0.72
    ED        0.72
    edir      0.67
    edata     0.66
    edd       0.63

    Activation Density: 0.424%

    No Known Activations
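
Two entries in the negative-logit list (èĢħãģ® and ÛĮÚ©) look like raw byte-level BPE token strings of the kind used by GPT-2-style tokenizers. That is an assumption, since this page does not name the tokenizer. Under that assumption, a minimal Python sketch for turning such strings back into readable text could look like the following. `bytes_to_unicode` follows the byte-to-character table published with GPT-2's encoder; `decode_token` is a hypothetical helper introduced here for illustration.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned the next code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(token):
    """Hypothetical helper: invert the table, then decode the bytes as UTF-8.

    Assumes the token uses the GPT-2 byte-level convention. A character
    outside the table raises KeyError; invalid UTF-8 becomes U+FFFD.
    """
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(inverse[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["èĢħãģ®", "ÛĮÚ©", "ed"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the two entries decode to 者の (Japanese) and یک (Persian), while plain ASCII tokens such as `ed` pass through unchanged.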