INDEX
    Explanations

    references to men and masculinity in various contexts

    Negative Logits
    dale                    -0.17
    allon                   -0.17
     AuthenticationService  -0.15
    ture                    -0.15
    dge                     -0.15
    enticate                -0.15
    gaard                   -0.15
     DAG                    -0.15
    er                      -0.14
    ssel                    -0.14
    Positive Logits
    opause  0.28
    aced    0.25
    folk    0.23
    volent  0.22
    ninger  0.21
    ubar    0.19
    -only   0.19
    aces    0.17
    insky   0.16
    orca    0.16
    Activation Density: 0.040%

    No Known Activations