    Explanations

    connotations of condescension and dehumanization

    Negative Logits
    Reviewer  -0.83
    ansas  -0.77
    «ĺ  -0.77
    FORE  -0.73
    STD  -0.66
    IRO  -0.66
     Gors  -0.66
    lly  -0.65
     Spit  -0.64
    instein  -0.63
    Positive Logits
    asking  1.20
    ension  1.02
    ensions  1.01
    essential  0.90
    ouch  0.88
     multit  0.86
    itude  0.83
    agic  0.82
    ributes  0.82
    asks  0.82
    Activation Density: 0.006%

    No Known Activations