INDEX
    Explanations

    personal pronouns used for self-referencing actions

    phrases that contain the word "themselves."

    Negative Logits
    "grade"      -0.75
    "amia"       -0.73
    "order"      -0.70
    "execute"    -0.69
    "ulton"      -0.68
    "aster"      -0.67
    "asia"       -0.67
    "ritz"       -0.66
    "pour"       -0.66
    "pak"        -0.65
    Positive Logits
    "selves"         1.18
    " tremend"       0.98
    " selves"        0.93
    "self"           0.91
    " exting"        0.88
    " themselves"    0.87
    " conduc"        0.86
    " proport"       0.84
    " exha"          0.82
    " behavi"        0.79
    Activation Density: 0.027%

    No Known Activations