INDEX
    Explanations

    words related to feelings of contempt or disrespect towards authority or specific groups

    Negative Logits
    "ramid"              -0.69
    " hemor"             -0.68
    " Lans"              -0.65
    "NetMessage"         -0.64
    " encyclopedia"      -0.63
    " toget"             -0.61
    "akeru"              -0.60
    " reconstruction"    -0.60
    " stabilization"     -0.60
    " advoc"             -0.60

    Positive Logits
    "uously"              1.55
    "uous"                1.52
    "fully"               1.20
    "ible"                1.18
    "ibly"                1.09
    "ful"                 1.04
    "ateurs"              1.02
    "ardless"             1.01
    "urous"               1.00
    "orable"              0.99

    Act Density 0.041%

    No Known Activations