    Explanations

    This neuron detects the presence of instruction‐style or template tokens in the prompt (e.g. header words like “Your,” “should,” “Here,” and numeric placeholders).

    Negative Logits
     phases         -0.08
    cts             -0.07
    orarily         -0.07
    :m              -0.07
     timeout        -0.07
     consecutive    -0.07
    .tabs           -0.06
     :(             -0.06
     aspects        -0.06
     облад          -0.06
    Positive Logits
     Venom          0.07
    夫人              0.06
    	EIF            0.06
     insufficient   0.06
    _RECV           0.06
     uten           0.06
    vern            0.06
    erculosis       0.06
     milf           0.06
    어나              0.06
    Activation Density: 0.008%

    No Known Activations