    Explanations

    second-person references and personal connections in the text

    Negative Logits
    "ACHI"         -0.15
    "pora"         -0.14
    "redicate"     -0.14
    "lady"         -0.14
    "agna"         -0.14
    " themselves"  -0.14
    "ehr"          -0.14
    "aldo"         -0.14
    "lena"         -0.14
    "à¸Ĺร"         -0.13
    Positive Logits
    " oneself"     0.19
    "alex"         0.15
    "ixel"         0.15
    " yourself"    0.14
    ".TabStop"     0.14
    "_threads"     0.14
    " surviv"      0.13
    "ledon"        0.13
    "ATEST"        0.13
    "jav"          0.13
    Activation Density: 0.403%

    No Known Activations