INDEX
    Explanations

    instances of the pronoun "he" and its variations

    Negative Logits
    "were"          -0.20
    " from"         -0.19
    " itself"       -0.18
    " on"           -0.17
    " during"       -0.15
    "ly"            -0.15
    "isContained"   -0.15
    "nbsp"          -0.15
    "ness"          -0.15
    " across"       -0.15
    Positive Logits
    "'d"            0.64
    "'ll"           0.61
    "’d"            0.53
    "/she"          0.50
    "’ll"           0.50
    "'ve"           0.44
    "'re"           0.41
    "eding"         0.40
    "'s"            0.39
    " himself"      0.36
    Activation Density: 0.340%

    No Known Activations