INDEX
    Explanations

    noun phrases indicating outcomes or consequences

    phrases that indicate causality or outcomes

    Negative Logits
    Token        Logit
    " snipp"     -0.71
    "afort"      -0.68
    " redes"     -0.63
    "jug"        -0.62
    "zan"        -0.59
    " suspic"    -0.56
    " lapt"      -0.55
    "flying"     -0.55
    " Fired"     -0.54
    "hello"      -0.54
    Positive Logits
    Token        Logit
    " thereof"   1.08
    " of"        1.02
    " result"    0.78
    "result"     0.74
    "OF"         0.74
    "ainer"      0.73
    "Of"         0.68
    "of"         0.66
    "uating"     0.65
    "liest"      0.65
    Activation Density: 0.061%

    No Known Activations