INDEX
    Explanations

    instances of the word "surprise" in various forms and contexts

    Negative Logits
    oble     -0.18
    ixels    -0.17
    edly     -0.17
    oldt     -0.16
    oke      -0.16
    bare     -0.16
    esium    -0.15
    ertia    -0.15
    mouth    -0.15
    ed       -0.15
    Positive Logits
    prisingly     0.27
    -sur          0.21
    rounded       0.21
    prising       0.20
    charge        0.20
    veillance     0.19
    veys          0.19
    prises        0.19
    jective       0.19
    rogate        0.19
    Activation Density: 0.019%

    No Known Activations