INDEX
    Explanations

    Informal conversation

    This neuron activates on first-person self-references (e.g. “I” and its equivalent or inflected forms in other languages).

    Negative Logits
    Enumeration   -0.06
    /simple       -0.06
     Globals      -0.06
    δει           -0.06
     slide        -0.06
    (Transform    -0.06
    醴醴          -0.06
     Filme        -0.06
    _OPENGL       -0.06

    Positive Logits
     аг           0.07
     ikea         0.07
     trouvé       0.07
                  0.06
    فاق           0.06
    insk          0.06
     induces      0.06
     falls        0.06
    assword       0.06
     Ο            0.06

    Act Density 0.036%

    No Known Activations
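
    For readers unfamiliar with the "Act Density" figure, the sketch below shows one common way such a number could be computed: the fraction of sampled tokens on which the neuron's activation exceeds a threshold. This definition is an assumption (the page itself does not define the metric), and the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# tokens with activation > threshold) / (# tokens).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, of which 4 have a nonzero activation.
sample = [0.0] * 9996 + [1.2, 0.4, 2.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.040%
```

    Under this assumed definition, a density of 0.036% would mean the neuron fires on roughly 36 out of every 100,000 sampled tokens.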