INDEX
    Explanations

    phrases related to human aspects, such as human rights, health, and relationships

    Negative Logits
    RET             -0.75
     Transcript     -0.69
     Buckingham     -0.68
    forth           -0.66
    UGE             -0.62
    ENC             -0.59
     Scarborough    -0.59
    eryl            -0.59
    roller          -0.58
    etsy            -0.58
    Positive Logits
    itarian         1.24
     beings         1.13
    istic           1.10
    itar            1.10
    izes            1.04
    istically       1.01
    izing           1.01
     readable       0.99
    oids            0.98
    ization         0.96
    Activation Density: 4.191%

    No Known Activations