INDEX
    Explanations

    personal pronouns or possessive pronouns associated with a sense of self

    references to feelings and personal experiences

    Negative Logits
    Token (quoted)    Logit
    " Us"             -0.68
    " themselves"     -0.66
    " Their"          -0.63
    "Their"           -0.61
    " Plaint"         -0.59
    " arsen"          -0.58
    "idates"          -0.57
    " us"             -0.57
    " tariffs"        -0.56
    " Diff"           -0.55

    Positive Logits
    Token (quoted)    Logit
    " myself"         1.45
    " blogging"       0.93
    " my"             0.82
    " typing"         0.72
    " OCD"            0.71
    "watching"        0.69
    " researching"    0.68
    " writing"        0.68
    " personally"     0.66
    "aido"            0.66
    Activation Density: 1.006%
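
The density figure above is usually read as the fraction of sampled token positions on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 10 of which are active.
sample = [0.0] * 990 + [0.5] * 10
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density near 1% would mean the feature is active on roughly one token position in a hundred of the sampled text.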

    No Known Activations