INDEX
    Explanations

    pronouns referring to male individuals

    Negative Logits
    iland       -0.15
    edii        -0.14
    ë³ij        -0.14
    ((__        -0.14
    uego        -0.14
    Äĥm         -0.13
    -opacity    -0.13
    rupted      -0.13
    oggle       -0.13
     ZemÄĽ      -0.13
    Positive Logits
     or         0.19
    /her        0.15
    ados        0.15
    rello       0.14
    idi         0.14
    zan         0.14
    andise      0.14
     golf       0.14
     Bek        0.14
    olen        0.14
    Act Density 0.152%

    No Known Activations