INDEX
    Explanations

    derogatory and offensive language

    phrases expressing frustration or demands

    Negative Logits
    iHUD        -0.76
    Located     -0.73
     [|         -0.70
    catentry    -0.69
     srf        -0.69
    ipment      -0.69
    pleted      -0.67
    Initial     -0.67
    Location    -0.66
     ¥          -0.64
    Positive Logits
     hypocrisy      1.21
     feminists      1.16
     liberals       1.15
     hypocritical   1.14
     feminism       1.05
     leftists       1.03
     libertarians   1.00
     hypocr         0.99
     Chomsky        0.98
     Orwell         0.95
    Act Density 1.556%

    No Known Activations