INDEX
    Explanations

    instances of expressing surprise about something being observed for the first time

    expressions of novelty or unique experiences

    Negative Logits
    Token                   Logit
    "externalActionCode"    -0.68
    "ials"                  -0.66
    "recy"                  -0.61
    "ayers"                 -0.59
    "Shape"                 -0.58
    "bi"                    -0.56
    "Charge"                -0.56
    "roup"                  -0.55
    "uli"                   -0.55
    " refin"                -0.55
    Positive Logits
    Token                   Logit
    " anything"             0.93
    " anybody"              0.92
    " anyone"               0.84
    "anything"              0.77
    " anywhere"             0.76
    " dime"                 0.74
    " ANY"                  0.74
    " nor"                  0.69
    " bothered"             0.68
    " any"                  0.67
    Activation Density: 0.069%

    No Known Activations