INDEX
    Explanations

    references to female gender

    Negative Logits
    "lut"          -0.38
    "ann"          -0.37
    "Übersicht"    -0.36
    "ery"          -0.36
    "Rollback"     -0.36
    "iter"         -0.35
    "kt"           -0.35
    "relse"        -0.34
    "nos"          -0.33
    " spol"        -0.33
    Positive Logits
    "Female"        1.05
    " Female"       1.05
    " female"       1.03
    "female"        0.99
    " FEMALE"       0.87
    " femenina"     0.86
    " femeninos"    0.86
    " kvinna"       0.86
    " woman"        0.85
    "FEMALE"        0.83
    Act Density 0.169%

    No Known Activations