    Explanations

    conditioning

    This neuron detects special format markers (e.g. the “<|start_header_id|>” token and related control-sequence tokens) that delineate header or metadata sections.
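
As a rough illustration of the kind of token this explanation refers to, the sketch below scans a string for control-sequence markers of the form `<|...|>`. The regular expression, the function name `find_format_markers`, and the sample prompt are assumptions made for this example; the neuron itself responds to the model's internal token representations, not to pattern matching on text.

```python
import re

# Assumed pattern: special tokens written as <|name|>, e.g. <|start_header_id|>.
MARKER_RE = re.compile(r"<\|[A-Za-z0-9_]+\|>")


def find_format_markers(text):
    """Return (start_offset, marker) pairs for each marker found in text."""
    return [(m.start(), m.group()) for m in MARKER_RE.finditer(text)]


# Hypothetical prompt fragment containing two header markers.
sample = "<|start_header_id|>user<|end_header_id|>\n\nHello"
for offset, marker in find_format_markers(sample):
    print(offset, marker)
```

Text without any such markers yields an empty list, which makes the helper easy to use as a quick filter over candidate prompts.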

    Negative Logits
    Token               Logit
    ILL                 -0.07
    BO                  -0.07
     portray            -0.07
    (no token shown)    -0.06
     sanctioned         -0.06
    (no token shown)    -0.06
    ώντας               -0.06
    iphy                -0.06
    uego                -0.06
    (no token shown)    -0.06
    Positive Logits
    Token               Logit
    	src                0.07
    (no token shown)    0.07
    sburgh              0.06
    istical             0.06
    .SetBool            0.06
     사업               0.06
    .external           0.06
    北市                0.06
    }";↵                0.06
     victories          0.06
    Act Density 0.502%
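
Activation density is commonly read as the fraction of sampled tokens on which the neuron's activation is nonzero; under that assumption, 0.502% means roughly 5 in every 1,000 tokens. The sketch below computes such a fraction for a hypothetical list of activations. The function name `activation_density`, the zero threshold, and the sample values are illustrative and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds threshold (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 1,000 tokens: 5 active, 995 silent.
sample = [0.8, 1.2, 0.3, 2.1, 0.5] + [0.0] * 995
print(f"{activation_density(sample):.3%}")
```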

    No Known Activations