INDEX
    Explanations

    references to female characters and their actions or attributes

    pronouns followed by actions or outcomes

    Negative Logits
    Token              Logit
    "+#+#"             -0.82
    "Personensuche"    -0.77
    "出版年"            -0.76
    " الرياضيه"        -0.66
    " IBOutlet"        -0.63
    "сылкі"            -0.63
    "CppMethod"        -0.62
    "Hentet"           -0.62
    "UserScript"       -0.59
    "Tembelea"         -0.59
    Positive Logits
    Token              Logit
    " Rojas"           0.38
    " him"             0.38
    "zusch"            0.35
    " Meny"            0.35
    " éste"            0.35
    " Community"       0.35
    " Sem"             0.34
    " Económica"       0.33
    " reformed"        0.33
    "看他"              0.33
    Activation Density: 0.290%

    No Known Activations