    Explanations

    references to women and their achievements, particularly in leadership and historically significant roles

    Negative Logits
    Token        Logit
    "iker"       -0.15
    " Gut"       -0.15
    "Separated"  -0.15
    "acher"      -0.14
    "iphy"       -0.14
    "istar"      -0.14
    "andon"      -0.14
    "空"         -0.14
    "urgeon"     -0.14
    "-style"     -0.13
    Positive Logits
    Token        Logit
    "mutable"    0.15
    "ê¶Į"        0.15
    " Ness"      0.15
    "ERRU"       0.14
    "isco"       0.14
    " enthusi"   0.14
    "iat"        0.14
    "Äįet"       0.14
    "à¹Īาà¸ĩ"   0.14
    "OKEN"       0.14
    Activation Density: 0.083%

    No Known Activations