    Explanations

    phrases and words that describe disparities, inequalities, or imbalances

    instances of disparity or disproportionate impact on various groups or issues

    Negative Logits
    uring   -0.80
    ince    -0.74
    erm     -0.73
    love    -0.72
    adal    -0.71
    shire   -0.70
    ures    -0.70
    zyme    -0.69
    psons   -0.69
    DCS     -0.68
    Positive Logits
     disproportionately   0.93
     disproportion        0.92
     disadvantage         0.84
     disadvant            0.79
     favoring             0.78
     disenfranch          0.77
     disadvantages        0.74
     representation       0.72
    aga                   0.70
     benefiting           0.69
    Activation Density: 0.049%

    No Known Activations