    Explanations

    phrases indicating lack of surprise

    Negative Logits
    "abouts"      -0.84
    "abases"      -0.67
    "tein"        -0.64
    "inet"        -0.63
    "heastern"    -0.62
    "existent"    -0.62
    "nels"        -0.61
    "Administ"    -0.60
    "peer"        -0.59
    "nsic"        -0.58
    Positive Logits
    " surprise"          1.25
    " shock"             0.96
    " surprises"         0.94
    " Surprise"          0.91
    " surpr"             0.90
    " unexpected"        0.87
    " surprising"        0.86
    " disappointment"    0.81
    "shock"              0.81
    " revelation"        0.79
    Activation Density: 0.084%

    No Known Activations