INDEX
    Explanations

    phrases related to reactions and consequences

    surprise or unexpected outcomes

    NEGATIVE LOGITS
    " AttributeSet"   -0.60
    " MainAxisSize"   -0.57
    " houſe"          -0.57
    " testimonials"   -0.55
    " estekak"        -0.54
    " purpoſe"        -0.53
    " Houſe"          -0.53
    " faſt"           -0.53
    "Portale"         -0.49
    " Anſ"            -0.49
    POSITIVE LOGITS
    " surprise"       0.59
    "Surprise"        0.57
    " surprised"      0.55
    " overras"        0.55
    " Surprise"       0.54
    " surpresa"       0.50
    " überras"        0.48
    " surprising"     0.48
    " Überras"        0.47
    "surprise"        0.45
    Act Density 0.175%

    No Known Activations