    Explanations

    terms related to discrimination and civil rights violations

    Negative Logits
    Token             Logit
    "lify"            -0.17
    "aho"             -0.15
    "ift"             -0.15
    "oram"            -0.15
    "ilda"            -0.14
    "cn"              -0.14
    "_SIGNATURE"      -0.14
    ".ManyToMany"     -0.14
    "ough"            -0.14
    "esty"            -0.14

    Positive Logits
    Token             Logit
    " against"        0.18
    "against"         0.18
    "Against"         0.18
    " Against"        0.15
    " taste"          0.15
    "zew"             0.15
    "rzy"             0.15
    " towards"        0.14
    " hiring"         0.14
    "ellen"           0.14

    Activation Density: 0.028%

    No Known Activations