INDEX
    Explanations

    negative terms or insults

    derogatory terms and insults directed at individuals or groups

    Negative Logits
    tnc       -0.76
    ondo      -0.74
    RH        -0.74
    ira       -0.73
    winner    -0.72
    Recomm    -0.70
    APH       -0.69
    ranging   -0.69
    ANC       -0.68
    iture     -0.68
    Positive Logits
     idiots   1.03
     idiot    1.02
     bastard  0.98
     bully    0.97
     spew     0.95
     hypoc    0.91
     bitch    0.90
     asshole  0.89
     sucker   0.88
     crap     0.86
    Act Density 0.072%

    No Known Activations