    Explanations

    This neuron activates on demonstrative references and generic placeholders, especially words like “These” (e.g. “These things,” “These factors”) that point to previously mentioned items.

    Negative Logits
    Token          Logit
    "άρχ"          -0.06
    " radix"       -0.06
    "	light"       -0.06
    " hacked"      -0.06
    " memes"       -0.06
    " entreprise"  -0.06
    " breve"       -0.06
    "Cars"         -0.06
    "十一"         -0.06
    " foods"       -0.06
    Positive Logits
    Token          Logit
    "pal"          0.07
    "refer"        0.06
    "_IC"          0.06
    "】↵"          0.06
    "ummy"         0.06
    "’s"           0.06
    "coach"        0.06
    "odigo"        0.06
    " συ"          0.06
    "па"           0.06
    Activation Density: 0.129%

    No Known Activations