INDEX
    Explanations

    pronouns associated with male and female characters

    Negative Logits
        Token          Logit
        " Monfieur"    -0.81
        " Theſe"       -0.80
        " Conſ"        -0.79
        " Perſ"        -0.78
        " faſt"        -0.77
        " pleaſure"    -0.77
        " Reſ"         -0.77
        " ſeveral"     -0.74
        " Efq"         -0.74
        " Eſ"          -0.74
    Positive Logits
        Token          Logit
        " he"           2.51
        " He"           1.96
        " she"          1.95
        "He"            1.83
        " his"          1.64
        " they"         1.61
        " him"          1.50
        "She"           1.47
        " She"          1.44
        "she"           1.40
    Activation Density 0.156%

    No Known Activations