INDEX
    Explanations

    phrases related to self-identity and self-description

    phrases associated with self-identification or self-description

    Negative Logits
    ulhu            -1.15
     "$:/           -0.79
     Chains         -0.71
     Hutch          -0.70
     Boone          -0.69
     Twain          -0.68
    OUGH            -0.68
     Cah            -0.68
     Springs        -0.68
     EntityItem     -0.67
    Positive Logits
    talk            1.12
    imposed         1.09
    conscious       1.09
    esteem          1.04
    destruct        1.01
    generation      1.00
    eating          1.00
    proclaimed      0.98
    assert          0.96
    expression      0.96
    Activation Density: 0.043%

    No Known Activations