    Explanations

    terms addressing audiences in a formal setting

    mentions of "ladies" and "gentlemen."

    Negative Logits
    Token         Logit
    "Ds"          -0.79
    "osta"        -0.70
    "yrus"        -0.68
    "onis"        -0.68
    "aya"         -0.67
    "ython"       -0.67
    "icted"       -0.66
    "Emb"         -0.65
    "sequence"    -0.65
    "aeda"        -0.65

    Positive Logits
    Token         Logit
    " gentlemen"  0.89
    "maid"        0.85
    " gentleman"  0.83
    "utenant"     0.77
    "men"         0.75
    "woman"       0.75
    "owship"      0.75
    " Gaga"       0.74
    "bugs"        0.72
    " Toast"      0.70

    Activation Density: 0.015%

    No Known Activations