INDEX
    Explanations

    references to experiences or actions in a first-person perspective

    references to the concept of "person" in various contexts

    Negative Logits
    "DERR"          -0.73
    "enthal"        -0.69
    "Tx"            -0.68
    "ORK"           -0.66
    "tty"           -0.65
    " Mb"           -0.65
    "YP"            -0.64
    "CCC"           -0.64
    " avoidance"    -0.63
    "Phill"         -0.63

    Positive Logits
    "nel"            1.08
    "hood"           1.01
    "ality"          0.94
    "atives"         0.92
    "uscript"        0.87
    "alities"        0.86
    "izontal"        0.85
    "acles"          0.81
    "atural"         0.80
    "alties"         0.78

    Activation Density: 0.032%

    No Known Activations