    Explanations

    The neuron selectively activates on in‐text citation markers and reference labels (e.g. bracketed “[@HS…]” tokens and author‐initial tags).

    Negative Logits
    " comm"         -0.07
    "_inf"          -0.07
    " Sawyer"       -0.06
    " insured"      -0.06
    " donation"     -0.06
    "rust"          -0.06
    " affidavit"    -0.06
    "Replacing"     -0.06
    " Raid"         -0.06
    " vos"          -0.06
    Positive Logits
    " doğrult"      0.07
    (token not shown)  0.06
    " Profes"       0.06
    "dem"           0.06
    " здійс"        0.06
    " Değ"          0.06
    " Vous"         0.06
    " بم"           0.06
    "าษฎ"           0.06
    " Friendship"   0.06
    Activation Density: 0.014%

    No Known Activations