INDEX
    Explanations

    This neuron activates on the model’s internal control markers and separators (e.g. <|eot_id|>, <|start_header_id|>, and other special header or footer tokens), flagging document-structure tokens rather than ordinary text.
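To make the distinction in the explanation concrete, here is a minimal sketch that separates control markers from ordinary tokens. It assumes that control markers follow the `<|name|>` pattern seen in the two examples; the function names are hypothetical, and a real tokenizer may define special tokens that do not fit this pattern.

```python
import re

# Assumption: control markers look like <|name|>, as in <|eot_id|> and
# <|start_header_id|>. Real tokenizers may use other forms as well.
CONTROL_MARKER = re.compile(r"<\|[A-Za-z0-9_]+\|>")


def is_control_marker(token: str) -> bool:
    """Return True if the entire token matches the <|name|> pattern."""
    return CONTROL_MARKER.fullmatch(token) is not None


def split_tokens(tokens):
    """Partition tokens into (control markers, ordinary text tokens)."""
    control = [t for t in tokens if is_control_marker(t)]
    ordinary = [t for t in tokens if not is_control_marker(t)]
    return control, ordinary
```

Under this assumption, `split_tokens(["<|start_header_id|>", "user", "<|eot_id|>"])` places the two markers in the first list and `"user"` in the second.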

    Negative Logits
    'o
    -0.07
    *z
    -0.07
     Writer
    -0.07
     número
    -0.07
     شکن
    -0.06
     bankrupt
    -0.06
     hat
    -0.06
    Eff
    -0.06
    -0.06
    ведите
    -0.06
    Positive Logits
    /use
    0.07
    гар
    0.07
    atra
    0.07
    013
    0.06
    click
    0.06
     constitutes
    0.06
    ुछ
    0.06
    DataAdapter
    0.06
    submit
    0.06
    anganese
    0.06
    Activation Density 0.049%

    No Known Activations