INDEX
    Explanations

    references to the human body

    references to the concept of "human" and its characteristics or existence

    Negative Logits
    Token         Logit
    "urations"    -0.75
    "liga"        -0.70
    "arella"      -0.68
    "creen"       -0.67
    "eryl"        -0.66
    "OHN"         -0.66
    "RAG"         -0.66
    "uden"        -0.65
    "è¦"          -0.64
    "chell"       -0.64
    Positive Logits
    Token           Logit
    " beings"       1.44
    "itarian"       1.17
    "itar"          1.14
    "oids"          1.11
    "istic"         1.06
    "izing"         0.95
    " readable"     0.94
    " embryonic"    0.94
    "ized"          0.94
    "oid"           0.93
    Activation Density: 0.030%
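
To give the density figure a sense of scale, here is a minimal sketch that converts a percentage density into an expected number of activating tokens. It assumes the density is the fraction of sampled tokens on which the feature is active, which this page does not state; the function name `expected_activations` and the one-million-token sample size are hypothetical and chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumption: density_percent is the percentage of sampled tokens
    with a nonzero activation (not confirmed by the page itself).
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of one million tokens at the reported 0.030% density.
    print(expected_activations(0.030, 1_000_000))
```

Under that assumption, a density of 0.030% corresponds to roughly 3 activating tokens per 10,000, so the feature is sparse.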

    No Known Activations