INDEX
    Explanations

    The neuron fires on tokens marking the bibliography or references section (e.g. “bib,” “bibliography,” or “references.bib”).

    Negative Logits
     Fairy          -0.07
    verb            -0.07
    Exclude         -0.07
     Sentry         -0.06
     astr           -0.06
    HOUSE           -0.06
     Thy            -0.06
    hangi           -0.06
     SOUR           -0.06
    [token not rendered]  -0.06
    Positive Logits
    λε              0.07
    (parameters     0.07
     primera        0.07
    是不            0.06
    .tipo           0.06
     faaliyet       0.06
    。不            0.06
    amız            0.06
     diğer          0.06
    wk              0.06
    Activation Density: 0.001%

    No Known Activations