INDEX
    Explanations

    emotion-related words, such as "excitement," "sadness," and "arrogance."

    emotions and negative experiences

    Negative Logits
    REE           -0.65
    Activity      -0.64
    ppel          -0.63
    onal          -0.62
    ãĥīãĥ©        -0.61
    LAND          -0.61
    Go            -0.61
    asketball     -0.60
    WD            -0.59
    ĵ             -0.59
    Positive Logits
    iest          1.25
     inherent     1.11
    iness         0.94
     emanating    0.94
     afforded     0.94
     surrounding  0.93
     plag         0.86
    lessness      0.85
     quot         0.83
     aspect       0.82
    Activation Density 0.296%

    No Known Activations