INDEX
    Explanations

    personal pronouns followed by a verb

    instances of the pronoun "He."

    Negative Logits
    Token         Logit
    "vable"       -0.65
    "SPONSORED"   -0.59
    "/$"          -0.58
    "OUR"         -0.54
    "AGES"        -0.54
    "OUS"         -0.53
    "ARCH"        -0.52
    "ADS"         -0.52
    "ARB"         -0.51
    "vous"        -0.51
    Positive Logits
    Token         Logit
    " He"         2.88
    " His"        2.27
    "He"          2.15
    " Himself"    1.81
    " Him"        1.79
    "His"         1.72
    " She"        1.64
    "he"          1.44
    " he"         1.41
    " HE"         1.30
    Activation Density: 0.075%

    No Known Activations