INDEX
    Explanations

    pronouns that denote gender and quantify representation

    Negative Logits
    " member"          -0.18
    "riter"            -0.16
    "uben"             -0.15
    "endir"            -0.15
    "ennes"            -0.15
    " mj"              -0.15
    "adera"            -0.15
    "emme"             -0.15
    "Ŀ"                -0.15
    "zend"             -0.15

    Positive Logits
    " counterparts"     0.27
    " peers"            0.25
    "counter"           0.19
    " contempor"        0.18
    "-counter"          0.18
    " cohorts"          0.17
    " challeng"         0.17
    " peer"             0.17
    " colleagues"       0.17
    " fellows"          0.16

    Activation Density: 0.067%

    No Known Activations