    Explanations

    references to male and female pronouns in various contexts

    Negative Logits
     فريبيس
    -1.15
    énario
    -0.74
     “
    -0.67
    ंदीखरीदारी
    -0.67
     חיצוניים
    -0.66
    Välislingid
    -0.66
     unknownFields
    -0.65
     der
    -0.65
    GHIJKLM
    -0.64
     Normdatei
    -0.64
    Positive Logits
    He
    1.43
    She
    1.30
     itſelf
    1.29
     Diſ
    1.24
     Cæsar
    1.20
     Anſ
    1.18
     Monfieur
    1.17
     Efq
    1.16
     myſelf
    1.15
     Reſ
    1.14
    Activation Density: 0.095%

    No Known Activations