INDEX
    Explanations

    This neuron detects numbered list markers and step numerals: the numbers that introduce items in an ordered list or steps in a sequence.
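To make the explanation concrete, here is a minimal sketch of the kind of text pattern it describes. It is only an illustration of "numbered list markers and step numerals", not a description of how the neuron computes anything. The regular expression, the function name `find_list_markers`, and the sample text are all assumptions introduced for this example.

```python
import re

# Hypothetical pattern (an assumption, not taken from the dashboard):
# optional indentation, an optional "Step " prefix, a numeral,
# then ".", ")" or ":" followed by whitespace.
LIST_MARKER = re.compile(r"^\s*(?:step\s+)?(\d+)[.):]\s+", re.IGNORECASE)


def find_list_markers(text: str) -> list[tuple[int, str]]:
    """Return (line_index, numeral) for each line that opens with a list marker."""
    hits = []
    for i, line in enumerate(text.splitlines()):
        match = LIST_MARKER.match(line)
        if match:
            hits.append((i, match.group(1)))
    return hits


# Illustrative sample text, made up for this sketch.
sample = (
    "Setup:\n"
    "1. Install the package\n"
    "2) Run the tests\n"
    "Step 3: Deploy\n"
    "No marker here"
)
print(find_list_markers(sample))
```

Requiring whitespace after the punctuation keeps decimals such as "3.14" from being counted as list markers.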

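    Negative Logits
    Token (quoted; leading spaces are part of the token)    Logit
    "好看"                -0.07
    (token not shown)     -0.07
    " Teuchos"            -0.07
    " swe"                -0.07
    "但对于"              -0.07
    "But"                 -0.07
    "()._"                -0.07
    " encountering"       -0.06
    "pto"                 -0.06
    "一个好的"            -0.06

    Positive Logits
    Token (quoted; leading spaces are part of the token)    Logit
    " distances"          0.07
    " Payload"            0.07
    " legacy"             0.07
    (token not shown)     0.07
    ".magic"              0.07
    "安东尼"              0.07
    " Alex"               0.07
    " challenge"          0.07
    "ALCHEMY"             0.07
    " spinach"            0.07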
    Activation Density: 0.058%

    No Known Activations