INDEX
    Explanations

    gender-specific pronouns and references to female characters

    Negative Logits
    hani          -0.18
    hausen        -0.15
    dorf          -0.15
    ibraltar      -0.15
    endi          -0.14
     Affero       -0.14
    šet           -0.14
    kle           -0.14
    chner         -0.14
    986           -0.14
    Positive Logits
    anship        0.17
    vsp           0.16
     Mes          0.15
     {[           0.14
    ulp           0.14
    ior           0.13
    FK            0.13
    sak           0.13
    à¥ĥत          0.13
    .Navigation   0.13
    Act Density 0.417%

    No Known Activations