INDEX
    Explanations

    negative phrases or expressions of doubt

    Negation before "surprise" or similar words

    not surprising / no surprise

    Negative Logits
     للمعارف           -0.67
    mobileqq           -0.65
    Tikang             -0.63
     initComponents    -0.62
    !*\                -0.60
    HtmlAttribute      -0.59
     autorytatywna     -0.59
    __':\n             -0.58
    اشتی               -0.57
    writeValue         -0.57
    Positive Logits
     surprise      1.45
    surprise       1.17
     surprised     1.10
     Surprise      1.07
     surprises     1.06
     surprising    1.05
     shock         1.02
     wonder        1.02
     surpris       1.00
    Surprise       1.00
    Activation Density: 0.171%

    No Known Activations