    Explanations

    words related to personal actions or self-referential behavior

    references to self-identification or self-involvement

    Negative Logits
    Token          Logit
    "ammy"         -0.78
    "heny"         -0.77
    "ulton"        -0.73
    "artisan"      -0.73
    "illery"       -0.72
    "microsoft"    -0.72
    "cemic"        -0.70
    "apple"        -0.69
    "sweet"        -0.69
    "rought"       -0.68
    Positive Logits
    Token            Logit
    " tremend"       0.85
    " profess"       0.79
    "selves"         0.75
    " underwater"    0.75
    "åĤ"             0.73
    " submar"        0.71
    " ens"           0.70
    " personally"    0.69
    " worshipped"    0.68
    " creatively"    0.67
    Activation Density: 0.048%

    No Known Activations