    Explanations

    words related to personal experiences and challenges

    expressions related to critical thinking and social awareness

    Negative Logits
    Token           Logit
    " barring"      -0.67
    " theirs"       -0.66
    "except"        -0.65
    "amera"         -0.64
    "ARE"           -0.62
    "warning"       -0.62
    " alleging"     -0.62
    " urging"       -0.60
    " supplying"    -0.60
    "uploads"       -0.59
    Positive Logits
    Token           Logit
    " oneself"      1.34
    " yourself"     1.02
    " Yourself"     0.93
    "azeera"        0.74
    " myself"       0.72
    "arenthood"     0.71
    " shitty"       0.70
    " hindsight"    0.66
    " entails"      0.65
    " mates"        0.63
    Activation Density: 0.644%

    No Known Activations