INDEX
    Explanations

    phrases related to societal expectations, especially regarding appearance and gender

    attitudes and beliefs about gender and physical appearance

    Negative Logits
    " Ples"          -0.95
    "cussion"        -0.77
    " resumed"       -0.76
    "glers"          -0.75
    " Meteor"        -0.71
    "clinton"        -0.70
    " seism"         -0.69
    " ransomware"    -0.68
    "EStream"        -0.68
    "laun"           -0.68
    Positive Logits
    " superiority"       1.28
    " masculinity"       1.27
    " individuality"     1.22
    " attractiveness"    1.21
    " feminine"          1.17
    " femin"             1.17
    " uniqueness"        1.15
    " masculine"         1.13
    "worthiness"         1.10
    " inferior"          1.10
    Act Density 0.596%

    No Known Activations