INDEX
    Explanations

    This neuron activates on speaker-attribution language, especially reporting or quotation verbs such as “said” and “explained” and similar attribution cues.

    Negative Logits
    Token           Logit
    " fort"         -0.07
    (blank)         -0.06
    "lpVtbl"        -0.06
    " Manus"        -0.06
    " Sist"         -0.06
    " donn"         -0.06
    " Lingu"        -0.06
    (blank)         -0.06
    "866"           -0.06
    " территории"   -0.06
    Positive Logits
    Token           Logit
    " Published"    0.07
    "leanup"        0.06
    "Converter"     0.06
    ".Health"       0.06
    "JI"            0.06
    "iness"         0.06
    " preocup"      0.06
    " Коли"         0.06
    ".ct"           0.06
    "ervation"      0.06
    Activation Density: 0.057%

    No Known Activations