INDEX
    Explanations

    personal pronouns indicating ownership or affiliation

    pronouns related to personal experience and perspective

    Negative Logits
    Token          Logit
    "itialized"    -0.74
    "quartered"    -0.69
    "vine"         -0.68
    "cru"          -0.67
    "zyme"         -0.66
    "orously"      -0.66
    " Monstrous"   -0.64
    "tyard"        -0.63
    "ories"        -0.63
    "ĸļ"           -0.63
    Positive Logits
    Token          Logit
    " sake"        0.98
    " personally"  0.97
    " liking"      0.91
    "selves"       0.89
    " purposes"    0.89
    "self"         0.79
    " learners"    0.77
    "ummies"       0.76
    " reasons"     0.74
    " selves"      0.72
    Activation Density: 0.110%

    No Known Activations