    Explanations

    This neuron detects numbered placeholder tokens (e.g. “NAME_1”, “NAME_2”) used to label speakers or items.

    Negative Logits
    " IRC"               -0.07
    "Authority"          -0.07
    " EventBus"          -0.07
    " nextState"         -0.06
    " unusually"         -0.06
    [token not shown]    -0.06
    "建立"               -0.06
    "arlo"               -0.06
    ".On"                -0.06
    " pasado"            -0.06
    Positive Logits
    " dwarf"             0.06
    " Сан"               0.06
    " medicine"          0.06
    " diversion"         0.06
    "=random"            0.06
    "ırak"               0.06
    "OLID"               0.06
    " depressive"        0.06
    "good"               0.06
    "ΙΟ"                 0.06
    Activation Density: 0.005%

    No Known Activations