INDEX
    Explanations

    phrases related to opinions or statements made by a specific person

    references to the pronoun "he"

    Negative Logits
    "noon"          -0.84
    "etheless"      -0.75
    "acters"        -0.72
    "rocket"        -0.71
    "evidence"      -0.64
    "Operation"     -0.64
    "selection"     -0.60
    " totality"     -0.60
    " fractions"    -0.60
    "lihood"        -0.57
    Positive Logits
    " said"         1.43
    " wrote"        1.33
    " says"         1.24
    " joked"        1.17
    " exclaimed"    1.15
    "said"          1.14
    " tweeted"      1.14
    " explained"    1.13
    " told"         1.12
    " replied"      1.08
    Activation Density: 0.065%

    No Known Activations