INDEX
    Explanations

    references to significant male figures

    Negative Logits
    puter         -0.18
    ingly         -0.15
    ted           -0.15
    æķ£           -0.15
    itionally     -0.14
    agy           -0.14
    syn           -0.14
    itte          -0.14
    lectric       -0.14
    gether        -0.14

    Positive Logits
    ufac           0.20
    hattan         0.19
    /her           0.17
    opause         0.16
    agements       0.16
    hunt           0.16
    iac            0.16
    agment         0.15
    äh             0.15
    ne             0.14
    Act Density 0.143%

    No Known Activations