    Explanations

    phrases related to self-identity and self-awareness

    references to identity and self-perception

    Negative Logits
     Refresh        -0.69
    uish            -0.69
     refresh        -0.66
    heny            -0.65
    teness          -0.65
     Highlights     -0.63
    idates          -0.62
     Dism           -0.61
     stray          -0.61
     Kills          -0.59
    Positive Logits
     supposed       0.97
     gonna          0.86
     destined       0.85
     going          0.83
     doing          0.82
     able           0.81
     happening      0.78
     presented      0.77
     weakest        0.76
     experiencing   0.76
    Activation Density: 0.157%

    No Known Activations