    Explanations

    references to significant women in history or prominent female figures

    following commas referring to females

    female royalty and historical figures

    Negative Logits
    " himself"        -1.14
    "himself"         -0.99
    " seines"         -0.84
    " brotherhood"    -0.81
    "łbym"            -0.81
    " Himself"        -0.81
    "وفاته"           -0.80
    " his"            -0.80
    "彼は"            -0.79
    " boyhood"        -0.79
    Positive Logits
    " herself"        2.07
    " her"            1.64
    " she"            1.57
    "herself"         1.56
    " 그녀"           1.21
    " její"           1.18
    " hennes"         1.18
    " shes"           1.18
    " haar"           1.16
    "彼女は"          1.12
    Activation Density: 1.966%

    No Known Activations