INDEX
    Explanations

    phrases related to worry or concerns

    concerning statements or expressions of worry

    Negative Logits
    Token          Logit
    "Guard"        -0.71
    "stead"        -0.62
    "osures"       -0.61
    "Laughs"       -0.60
    " Himself"     -0.60
    "orah"         -0.59
    "wn"           -0.59
    "EMBER"        -0.58
    "Drag"         -0.58
    "shit"         -0.58
    Positive Logits
    Token            Logit
    " misunder"      0.66
    "cher"           0.65
    " they"          0.64
    " provoked"      0.62
    " contradicts"   0.61
    "76561"          0.61
    " someday"       0.61
    " inexper"       0.58
    "eday"           0.57
    " underest"      0.57
    Activation Density: 0.227%

    No Known Activations