INDEX
    Explanations

    phrases related to personal autonomy and decision-making

    references to the concept of self-determination or autonomy

    Negative Logits
    Token        Logit
    " Mub"       -0.72
    "onite"      -0.71
    "ammy"       -0.70
    "ayne"       -0.70
    " Derby"     -0.70
    "rise"       -0.69
    "etta"       -0.69
    "vals"       -0.69
    "grade"      -0.67
    "iard"       -0.67
    Positive Logits
    Token            Logit
    " selves"        0.84
    " underwater"    0.82
    "selves"         0.77
    " explan"        0.70
    " creatively"    0.70
    " fict"          0.70
    " altru"         0.68
    " conduc"        0.68
    " destruct"      0.66
    " ashamed"       0.65
    Activation Density: 0.045%

    No Known Activations