INDEX
    Explanations

    words related to emotions, societal attitudes, and power dynamics

    themes related to emotions, power dynamics, and societal issues

    Negative Logits
    Token          Logit
    arnaev         -0.82
     crisp         -0.68
    guyen          -0.67
    zl             -0.67
    olulu          -0.64
    nces           -0.63
    pload          -0.63
    alez           -0.61
    iannopoulos    -0.59
    illion         -0.59
    Positive Logits
    Token          Logit
    lessness       0.94
    smanship       0.78
    iveness        0.76
    anasia         0.76
    thood          0.76
    fulness        0.73
    ism            0.70
    manship        0.69
    iness          0.68
    liness         0.66
    Activation Density: 0.378%

    No Known Activations