    Explanations

    references to existential beliefs and philosophical questions about purpose and reality

    Negative Logits
    " يتيمه"        -0.64
    " months"       -0.60
    " myſelf"       -0.59
    " Offisielt"    -0.59
    " faſt"         -0.59
    " himſelf"      -0.59
    " ſta"          -0.58
    " sempat"       -0.58
    " chofe"        -0.57
    "istoitu"       -0.56

    Positive Logits
    " human"        0.71
    " humans"       0.67
    " человек"      0.61
    "人間の"        0.61
    "一個人"        0.60
    "humans"        0.59
    "所谓"          0.58
    "Humans"        0.58
    "一个人"        0.58
    "human"         0.57

    Activation Density: 0.473%

    No Known Activations