INDEX
    Explanations

    instances of personal pronouns, particularly "I" and "we"

    Negative Logits
    "fol"       -0.15
    "CC"        -0.15
    "chin"      -0.15
    " Loving"   -0.14
    "-loving"   -0.14
    "altern"    -0.14
    " spray"    -0.14
    "ervlet"    -0.14
    " Duffy"    -0.14
    "oplan"     -0.14
    Positive Logits
    "iag"       0.16
    "iams"      0.15
    "ENER"      0.15
    "aits"      0.15
    "uron"      0.15
    " sop"      0.14
    "XHR"       0.14
    "mland"     0.14
    "SENS"      0.14
    "ieu"       0.14
    Activation Density: 0.390%

    No Known Activations