    Explanations

    phrases related to equality and fairness in treatment

    Negative Logits
    Token            Logit
    "ьаж"            -0.56
    "bootstrapcdn"   -0.46
    "лтемелер"       -0.46
    " ویکی‌پدیای"     -0.45
    "balleur"        -0.45
    " Habits"        -0.44
    "AndEndTag"      -0.44
    "Qual"           -0.43
    " виправивши"    -0.43
    "NPS"            -0.43
    Positive Logits
    Token            Logit
    " fairness"      1.34
    " unfair"        1.20
    " inequ"         1.09
    " Fairness"      1.06
    " injustice"     1.02
    " inequality"    1.02
    " fairer"        1.01
    " justice"       1.00
    " inequalities"  0.99
    " unequal"       0.99
    Activation Density: 0.730%

    No Known Activations