INDEX
    Explanations

    words related to condemning or showing disapproval

    Negative Logits
    Token           Logit
    "ramid"         -0.76
    "IER"           -0.69
    "emis"          -0.68
    "lio"           -0.67
    "neau"          -0.66
    "BLIC"          -0.65
    " membr"        -0.65
    "OTE"           -0.64
    "NetMessage"    -0.64
    "seed"          -0.63
    Positive Logits
    Token               Logit
    " condemn"          0.97
    " condemning"       0.84
    " homophobic"       0.82
    " harshly"          0.81
    " racism"           0.81
    " unequivocally"    0.80
    " unres"            0.79
    " condemnation"     0.78
    "ations"            0.77
    " atrocities"       0.77
    Activation Density: 0.044%

    No Known Activations