INDEX
    Explanations

    actions or outcomes that have a significant impact or consequences

    action words that indicate declarations or statements of fact

    Negative Logits
    sw          -0.70
    away        -0.69
    aneous      -0.68
    nex         -0.64
    pton        -0.61
    squ         -0.59
    tex         -0.59
    xxx         -0.59
    yon         -0.59
    isp         -0.59
    Positive Logits
    ometimes    1.05
    ilver       0.94
    ensibly     0.82
    hift        0.81
    omething    0.79
    hirt        0.77
    everal      0.72
    olate       0.71
    rals        0.69
    uggest      0.68
    Act Density 0.467%

    No Known Activations