INDEX
    Explanations

    occurrences of the word "surprise" and its variations

    Negative Logits
    Token        Logit
    "une"        -0.18
    "ajs"        -0.18
    "mes"        -0.16
    "isters"     -0.15
    "casts"      -0.15
    "unes"       -0.15
    "ãĤįãģĨ"     -0.14
    "enna"       -0.14
    "मर"         -0.14
    "à¸Ńà¸ļ"     -0.14
    Positive Logits
    Token            Logit
    "ingly"          0.31
    " surprise"      0.24
    "ably"           0.21
    " surpr"         0.21
    " surprises"     0.20
    " Surprise"      0.20
    " surprised"     0.20
    "ively"          0.17
    " unexpected"    0.17
    "unexpected"     0.16
    Activation Density: 0.035%

    No Known Activations