INDEX
    Explanations

    mentions of gender, particularly focusing on males and their descriptions

    Negative Logits
    "PhysRevLett"       -0.49
    " lanka"            -0.49
    " karna"            -0.48
    "ntos"              -0.47
    "ConverterFactory"  -0.47
    " kuli"             -0.47
    "resizingMask"      -0.46
    "іга"               -0.46
    " kuf"              -0.45
    " adipis"           -0.45
    Positive Logits
    " male"             1.18
    "Male"              1.09
    " Male"             1.08
    "male"              1.03
    " MALE"             1.01
    " males"            0.97
    "Males"             0.92
    " Males"            0.89
    " témoignage"       0.87
    " actionTypes"      0.84
    Act Density 0.052%

    No Known Activations