    Explanations

    expressions indicating surprise or unexpected outcomes

    phrases indicating the expectation of surprise or the state of being taken aback

    Negative Logits
    Token             Logit
    "ebted"           -0.71
    " Logged"         -0.69
    "neys"            -0.61
    "oller"           -0.58
    "ney"             -0.58
    "them"            -0.57
    "ter"             -0.56
    " Attributes"     -0.55
    "tery"            -0.55
    " filibuster"     -0.54
    Positive Logits
    Token             Logit
    " unsur"          0.81
    " no"             0.72
    "pires"           0.72
    " follows"        0.70
    " surprise"       0.68
    "pired"           0.68
    " shock"          0.68
    "ynchron"         0.67
    " unwelcome"      0.67
    " far"            0.67
    Activation Density: 0.039%

    No Known Activations