INDEX
    Explanations

    references to women in various contexts

    Negative Logits
    "ustimmung"        -0.70
    "ScopeManager"     -0.67
    " Boulogne"        -0.65
    "Mille"            -0.65
    " Naidu"           -0.64
    "enumi"            -0.63
    " RIAA"            -0.62
    " emailAlready"    -0.62
    " Combien"         -0.61
    " charlotte"       -0.60
    Positive Logits
    "s"                0.78
    " womens"          0.70
    " mens"            0.68
    "omens"            0.67
    " Mens"            0.65
    "childrens"        0.62
    " Womens"          0.60
    "iseks"            0.59
    "womens"           0.59
    "AddTagHelper"     0.55
    Activation Density: 0.113%

    No Known Activations