    Explanations

    references to fairness or equity in various contexts

    Negative Logits
    Token     Logit
    elho      -0.17
    pheres    -0.16
    owo       -0.15
    obia      -0.15
    eve       -0.15
    ĤŃ        -0.15
    endor     -0.14
    gia       -0.14
    sWith     -0.14
    stag      -0.14
    Positive Logits
    Token     Logit
    yt        0.39
    ground    0.34
    weather   0.33
    grounds   0.33
    er        0.28
    ies       0.27
    fax       0.27
    child     0.26
    play      0.25
    banks     0.25
    Activation Density: 0.029%

    No Known Activations