INDEX
    Explanations

    terms related to gender identity and gender equality

    Negative Logits
    yan       -0.17
    ãĥ«ãĥĪ    -0.16
    yun       -0.15
    eltas     -0.14
    yas       -0.14
    vals      -0.14
    ivr       -0.14
    lashes    -0.14
    sie       -0.14
    deo       -0.14
    Positive Logits
    ed          0.37
     roles      0.25
    edn         0.25
    que         0.23
    less        0.21
    fluid       0.21
    -neutral    0.21
     Roles      0.21
    -role       0.21
    edBy        0.20
    Activation Density: 0.011%

    No Known Activations