    Explanations

    phrases related to expressing concern or worry

    instances of the word "concerned."

    Negative Logits
    "artifacts"    -0.76
    " Bom"         -0.73
    "avorite"      -0.69
    "arb"          -0.68
    "ingers"       -0.67
    "ingen"        -0.67
    "lite"         -0.65
    "ety"          -0.65
    "sword"        -0.64
    "obs"          -0.64
    Positive Logits
    " trolling"    0.78
    "reon"         0.76
    "lessly"       0.73
    "ingly"        0.71
    "ienced"       0.69
    "wart"         0.69
    "atives"       0.69
    "cerned"       0.68
    "iversal"      0.68
    "edly"         0.67
    Activation Density: 0.029%

    No Known Activations