    Explanations

    pronouns and verbs related to self-identity

    references to identity and self-perception

    Negative Logits
    Token         Logit
    "artifacts"   -0.62
    " diffusion"  -0.61
    " Grounds"    -0.60
    " liner"      -0.60
    "hill"        -0.59
    " Passage"    -0.57
    " horizont"   -0.57
    "UCT"         -0.56
    " sheds"      -0.55
    " adv"        -0.55
    Positive Logits
    Token         Logit
    "borgh"       0.76
    " become"     0.72
    " am"         0.71
    "uably"       0.70
    "Thumbnail"   0.68
    " Become"     0.66
    " truly"      0.66
    "ãĤ«"         0.63
    " pretended"  0.62
    " aspire"     0.62
    Activation Density: 0.117%

    No Known Activations