    Explanations

    phrases indicating strong emotions or opinions, often in a negative context

    instances of rants and violent outbursts

    Negative Logits
    Token           Logit
    "undai"         -0.88
    "metics"        -0.86
    "tarians"       -0.75
    "ierrez"        -0.72
    "rity"          -0.71
    "cius"          -0.71
    "liv"           -0.68
    " Orchestra"    -0.66
    " Assistance"   -0.66
    "tarian"        -0.65
    Positive Logits
    Token           Logit
    " tir"          0.86
    " vengeance"    0.84
    "atical"        0.82
    " against"      0.78
    "quit"          0.76
    " spree"        0.74
    " rampage"      0.73
    " AGA"          0.73
    "¯¯¯¯"          0.73
    " rage"         0.71
    Activation Density: 0.087%

    No Known Activations