    Explanations

    The neuron detects self-referential or reflexive language: tokens and phrases that refer back to the subject itself (e.g., "self", "itself", "talking to itself", and compounds beginning with "self-").

    Negative Logits
    Token             Logit
     mirrors          -0.07
     Since            -0.07
    [token missing]   -0.06
    Since             -0.06
     variants         -0.06
     customers        -0.06
    itation           -0.06
    "There            -0.06
     do               -0.06
    ۱۴                -0.06
    Positive Logits
    Token             Logit
    ungkin            0.07
    َح                0.07
    [token missing]   0.06
    #error            0.06
    bitmap            0.06
     unsett           0.06
     Pathfinder       0.06
    Tot               0.06
    Jimmy             0.06
     olduğ            0.06
    Activation Density: 0.234%

    No Known Activations