INDEX
    Explanations

    references to different personal pronouns, particularly in the context of gender

    Negative Logits
    Token             Logit
     Privacidade      -0.71
    पया               -0.71
    esModule          -0.69
    ContentAsync      -0.67
    Datuak            -0.65
     Picchu           -0.64
    ScopeManager      -0.63
    étaire            -0.63
    RenderAtEndOf     -0.62
    JNIEnv            -0.62
    Positive Logits
    Token             Logit
     He               0.72
     he               0.70
     She              0.69
    He                0.66
    ("")]\n           0.64
     his              0.60
     Her              0.60
     His              0.58
    }}^{(             0.57
    She               0.57
    Activation Density: 0.307%
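
    A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under that reading, a density of 0.307% would correspond to the feature firing on roughly 3 of every 1000 sampled tokens.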

    No Known Activations