INDEX
    Explanations

    Expressions of frustration and criticism toward political figures or situations

    Profanity after articles or pronouns

    Swear words and insults

    Negative Logits
    "),
    
    -0.61
    ]='\
    -0.59
    ,))
    -0.56
    _
    
    -0.55
    Искәрмәләр
    -0.55
    ]-'
    -0.54
     tph
    -0.52
    !="")
    -0.51
    uxxxx
    -0.50
    */}
    -0.50
    Positive Logits
     fuck
    2.20
     shit
    2.09
    fuck
    1.98
     fucking
    1.95
     damn
    1.91
     Fuck
    1.91
     damned
    1.88
    Fuck
    1.84
     FUCK
    1.82
     fucked
    1.79
    Activation Density: 0.330%

    No Known Activations