INDEX
    Explanations

    pronouns or nouns related to self-representation or self-identification

    occurrences of the word "themselves."

    Negative Logits (token, logit; quotes preserve each token's leading space)
    " Sierra"       -0.72
    "aster"         -0.69
    " Derby"        -0.67
    "amia"          -0.66
    "etta"          -0.66
    " Fulton"       -0.65
    " CASE"         -0.65
    " Syndicate"    -0.64
    " Rail"         -0.64
    " Ki"           -0.64
    Positive Logits (token, logit; quotes preserve each token's leading space)
    "selves"            1.24
    " selves"           1.01
    " tremend"          0.80
    "self"              0.79
    " conduc"           0.79
    " creatively"       0.78
    " underwater"       0.77
    " themselves"       0.74
    " proport"          0.74
    " spontaneously"    0.73
    Activation Density: 0.037%
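
    The page does not say how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero, is given here. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, with the feature firing on 37 of them.
sample = [0.0] * 100_000
for i in range(37):
    sample[i * 100] = 1.5

density = activation_density(sample)
print(f"{density:.3f}%")
```

    Under that assumption, a density of 0.037% corresponds to roughly 37 active tokens per 100,000 sampled.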

    No Known Activations