INDEX
    Explanations

    The neuron fires on second-person references, particularly “you” and related forms such as “your”, when they address the reader.

    Negative Logits
     bil          -0.07
    :F            -0.07
    Fi            -0.07
    el            -0.06
    [F            -0.06
     Fel          -0.06
    Lim           -0.06
     ölüm         -0.06
     Lil          -0.06
    (reordered    -0.06
    Positive Logits
     you          0.24
     You          0.21
    you           0.19
    You           0.18
     YOU          0.17
    .You          0.17
    -you          0.14
     your         0.14
    "You          0.13
    —you          0.13
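
One quick way to check the explanation against the positive logits is to normalize each listed token and count how many are second-person forms. The sketch below is illustrative only: the token/value pairs are copied from the list above, while `SECOND_PERSON`, `normalize`, and `second_person_share` are hypothetical names introduced here and are not part of the dashboard or any library.

```python
import re

# Top positive logits copied from the list above as (token, logit) pairs.
# Leading spaces and punctuation are part of the tokens themselves.
POSITIVE_LOGITS = [
    (" you", 0.24),
    (" You", 0.21),
    ("you", 0.19),
    ("You", 0.18),
    (" YOU", 0.17),
    (".You", 0.17),
    ("-you", 0.14),
    (" your", 0.14),
    ('"You', 0.13),
    ("\u2014you", 0.13),  # token with a leading em dash
]

# Assumption: a small hand-picked set of second-person word forms.
SECOND_PERSON = {"you", "your", "yours", "yourself", "yourselves"}


def normalize(token: str) -> str:
    """Lowercase the token and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", token.lower())


def second_person_share(pairs):
    """Return the fraction of tokens whose normalized form is second-person."""
    hits = [tok for tok, _ in pairs if normalize(tok) in SECOND_PERSON]
    return len(hits) / len(pairs)


share = second_person_share(POSITIVE_LOGITS)
print(f"{share:.0%} of the top positive tokens are second-person forms")
```

The check only covers the ten tokens shown here, so it supports the explanation for the top of the logit list and says nothing about which contexts the neuron activates on.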
    Activation Density: 0.746%

    No Known Activations