INDEX
    Explanations

    references to women and gender-related statistics

    Negative Logits
    Token       Logit
    "eeper"     -0.17
    " الو"      -0.16
    "řiv"       -0.15
    "ieux"      -0.15
    "éo"        -0.14
    " Rubin"    -0.14
    "elsen"     -0.14
    "pper"      -0.13
    "ano"       -0.13
    "atter"     -0.13

    Positive Logits
    Token       Logit
    "335"       0.18
    "eman"      0.14
    "roti"      0.14
    "amu"       0.14
    " tabs"     0.14
    " PF"       0.14
    "_kv"       0.14
    "osten"     0.13
    "ABS"       0.13
    "oje"       0.13
    Act Density 0.537%

    No Known Activations