    Explanations

    phrases containing the word "human"

    references to humans and their characteristics or behaviors

    Negative Logits
    Token             Logit
    "arella"          -0.78
    "angles"          -0.75
    "RAG"             -0.74
    "kick"            -0.74
    "forth"           -0.71
    "rypt"            -0.70
    "wark"            -0.70
    "ippi"            -0.69
    "ãģĨ"             -0.69
    "etsy"            -0.68
    Positive Logits
    Token             Logit
    " beings"         1.18
    " readable"       0.99
    " embryonic"      0.99
    "oids"            0.97
    "itarian"         0.85
    " rights"         0.84
    "itar"            0.81
    " genome"         0.81
    " civilization"   0.80
    " fra"            0.78
    Activation Density: 0.023%

    No Known Activations