    Explanations

    sentences expressing surprise or astonishment

    phrases indicating surprise or astonishment

    Negative Logits
    Token        Logit
    "alach"      -0.78
    "enture"     -0.75
    "reau"       -0.75
    " condu"     -0.70
    " conduit"   -0.68
    "illes"      -0.67
    "uti"        -0.67
    "aim"        -0.65
    " vend"      -0.61
    " qui"       -0.60

    Positive Logits
    Token          Logit
    "DERR"         0.84
    " surprise"    0.73
    "IGHT"         0.72
    "iry"          0.67
    " how"         0.67
    "ãĥ©ãĥ³"       0.62
    " headlines"   0.61
    "Appearance"   0.61
    " seeing"      0.59
    " [*"          0.59
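
The positive-logit token ãĥ©ãĥ³ looks like a raw byte-level BPE vocabulary string rather than readable text. The sketch below shows one way to recover the underlying characters, assuming the model's tokenizer uses the GPT-2-style byte-to-unicode table (the source does not say which tokenizer is used). The function names `gpt2_bytes_to_unicode` and `decode_byte_level_token` are hypothetical helpers written for this illustration.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table.

    Assumption: the tokenizer behind this dashboard uses this mapping.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Remaining bytes (controls, space, etc.) are shifted to 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


def decode_byte_level_token(token):
    """Map each vocabulary character back to its byte, then decode as UTF-8."""
    char_to_byte = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_byte_level_token("ãĥ©ãĥ³"))  # prints ラン
```

Under that assumption the token decodes to the katakana pair ラン. Tokens that consist only of plain ASCII letters, such as DERR or IGHT, come back unchanged.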
    Activation Density: 0.210%

    No Known Activations