INDEX
    Explanations

    themes related to negative experiences and emotions

    NEGATIVE LOGITS
     sumpay             -0.55
     bilingual          -0.53
     complementary      -0.52
     zest               -0.52
    WireFormatLite      -0.52
     complements        -0.51
    قق                  -0.51
     accomplishments    -0.51
     egli               -0.50
     pioneers           -0.50
    POSITIVE LOGITS
     dangerous          0.68
    😡                  0.64
     CWE                0.64
     offending          0.63
     attack             0.61
    postsleuth          0.59
    🤬                  0.58
    🤦                  0.58
    😨                  0.57
     malicious          0.57
    Act Density 1.669%

    No Known Activations