INDEX
    Explanations

    instances of systemic bias, particularly in the context of gender inequality and institutional practices

    Negative Logits
    Token      Logit
    "747"      -0.14
    "ogan"     -0.14
    "anders"   -0.14
    "921"      -0.14
    "трон"     -0.13
    "raud"     -0.13
    "onga"     -0.13
    "920"      -0.13
    "leigh"    -0.12
    "ias"      -0.12
    Positive Logits
    Token            Logit
    " across"        0.39
    " everywhere"    0.31
    " both"          0.31
    " Across"        0.27
    "Across"         0.26
    " wherever"      0.26
    "both"           0.25
    " throughout"    0.24
    " ranging"       0.23
    " både"          0.23
    Act Density 0.358%

    No Known Activations