INDEX
    Explanations

    phrases indicating lack of surprise

    phrases indicating surprise or unexpectedness

    Negative Logits
    Token         Logit
    "tein"        -0.67
    "ouf"         -0.65
    " Respond"    -0.64
    "abases"      -0.63
    " Role"       -0.63
    "chens"       -0.62
    "eday"        -0.62
    "abouts"      -0.62
    " href"       -0.60
    "href"        -0.60
    Positive Logits
    Token                Logit
    " surprise"          1.46
    " shock"             1.26
    " surpr"             1.06
    "shock"              1.02
    " Surprise"          0.97
    " disappointment"    0.96
    " relief"            0.93
    " revelation"        0.92
    " surprises"         0.91
    " surprising"        0.88
    Activation Density: 0.072%

    No Known Activations