INDEX
    Explanations

    negative adjectives and insults

    derogatory terms aimed at individuals and their characteristics

    Negative Logits
    iscover           -0.71
     ideally          -0.69
     outdoors         -0.68
     Enc              -0.67
     incorporating    -0.67
    arger             -0.65
     incorporate      -0.65
     Fold             -0.63
     consulted        -0.63
     conserv          -0.63
    Positive Logits
     pathetic         2.20
     worthless        1.89
     idiots           1.89
     stupidity        1.89
     hypocritical     1.86
     meaningless      1.83
     bullshit         1.83
     laughable        1.82
     pointless        1.81
     shitty           1.79
    Activation Density: 0.090%

    No Known Activations