INDEX
    Explanations

    self-related terms and concepts

    references to self-identity and self-awareness

    Negative Logits
    Token            Logit
    " GOODMAN"       -0.75
    " Amend"         -0.74
    " Nights"        -0.73
    " Slay"          -0.72
    " Ashe"          -0.71
    " Horizon"       -0.71
    " Orchestra"     -0.70
    " Vid"           -0.69
    " Leap"          -0.68
    "IUM"            -0.68
    Positive Logits
    Token            Logit
    "destruct"       1.20
    " destruct"      1.04
    "conscious"      0.96
    "upload"         0.93
    " explanatory"   0.91
    "same"           0.82
    " esteem"        0.80
    "lessly"         0.78
    " calcul"        0.77
    "contained"      0.77
    Activation Density: 0.016%

    No Known Activations