INDEX
    Explanations

    academic citations

    The neuron activates primarily on numeric tokens, such as volume, page, and year numbers and other multi-digit numbers.

    Negative Logits
    Token     Logit
    "CA"      -0.09
    " Cox"    -0.08
    "c"       -0.08
    " Oak"    -0.08
    "cro"     -0.08
    "aco"     -0.08
    "ac"      -0.08
    " Arc"    -0.08
    "noc"     -0.07
    "cale"    -0.07
    Positive Logits
    Token              Logit
    "7"                0.13
    " seven"           0.09
    "107"              0.09
    "197"              0.09
    "97"               0.09
    [token not shown]  0.09
    "307"              0.08
    [token not shown]  0.08
    "27"               0.08
    [token not shown]  0.08
    Activation Density: 0.237%

    No Known Activations