    Explanations

    possessive pronouns referring to the speaker's experiences or belongings

    Negative Logits
    Token            Logit
    "ary"            -0.15
    "marks"          -0.15
    " yourselves"    -0.14
    "mark"           -0.14
    "light"          -0.14
    "ict"            -0.14
    "hower"          -0.14
    "markt"          -0.13
    "tails"          -0.13
    "gether"         -0.13
    Positive Logits
    Token            Logit
    "rtle"           0.27
    "SELF"           0.24
    " own"           0.23
    "/her"           0.23
    "zelf"           0.22
    "opia"           0.21
    "self"           0.21
    "opic"           0.19
    "/us"            0.19
    "anmar"          0.19
    Activation Density: 0.128%

    No Known Activations