INDEX
    Explanations

    key concepts related to responsibility, policy, and measurable impacts in social contexts

    Negative Logits
    Token         Logit
    " Know"       -0.60
    "selves"      -0.59
    "iatus"       -0.59
    "ECA"         -0.59
    "zan"         -0.57
    "+."          -0.55
    "ochet"       -0.54
    "poon"        -0.53
    "orea"        -0.53
    "isse"        -0.53
    Positive Logits
    Token            Logit
    " becomes"       0.87
    " disappears"    0.85
    " varies"        0.84
    "iest"           0.79
    " remains"       0.78
    " ceases"        0.77
    " goes"          0.75
    " consists"      0.74
    " arises"        0.73
    " evolves"       0.73
    Act Density 0.295%

    No Known Activations