INDEX
    Explanations

    personal pronouns followed by expressions of belief or opinion

    the pronoun "he" in various contexts

    Negative Logits
    "etheless"    -0.77
    "noon"        -0.69
    "anking"      -0.68
    "conscious"   -0.66
    " Maid"       -0.65
    "rocket"      -0.64
    "cious"       -0.63
    "breeding"    -0.63
    "berra"       -0.62
    " Deaths"     -0.62
    Positive Logits
    " said"       1.07
    " wrote"      1.05
    " tweeted"    0.97
    "'d"          0.96
    "said"        0.95
    " joked"      0.93
    " Said"       0.91
    " says"       0.91
    "'ll"         0.90
    "Said"        0.87
    Activation Density: 0.067%

    No Known Activations