INDEX
    Explanations

    phrases related to surprise or revelation

    statements related to surprising or unexpected information and their phrasing

    Negative Logits
    Token                Logit
    "ascript"            -0.68
    "ourses"             -0.66
    "sequently"          -0.65
    "ilial"              -0.64
    "rans"               -0.62
    " [+"                -0.61
    "[_"                 -0.61
    "azel"               -0.61
    "Applications"       -0.58
    "emort"              -0.57

    Positive Logits
    Token                Logit
    " kidding"           0.89
    " nerds"             0.77
    " damned"            0.75
    " darn"              0.75
    " understatement"    0.75
    " Cheap"             0.73
    " steroids"          0.72
    " hats"              0.71
    " Dirty"             0.68
    " classy"            0.68
    Activation Density: 1.991%

    No Known Activations