Explanations
questions
This neuron activates on the model’s internal control markers and separators (e.g. <|eot_id|>, <|start_header_id|>, special header or footer tokens), flagging document-structure tokens rather than normal text.
Negative Logits
'o          -0.07
*z          -0.07
Writer      -0.07
número      -0.07
شکن         -0.06
bankrupt    -0.06
hat         -0.06
Eff         -0.06
査          -0.06
ведите      -0.06
POSITIVE LOGITS
/use         0.07
гар          0.07
atra         0.07
013          0.06
click        0.06
constitutes  0.06
ुछ           0.06
DataAdapter  0.06
submit       0.06
anganese     0.06
Activations Density 0.049%
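
For a sense of scale, a density of 0.049% corresponds to roughly 49 active tokens per 100,000. The sketch below shows one way such a figure could be computed, assuming density means the share of tokens whose activation is above zero. The page does not state how the number is derived, so that definition, the `activation_density` helper, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of active tokens; the dashboard
    does not document its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 49 active tokens out of 100,000.
sample = [1.0] * 49 + [0.0] * 99_951
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.049%"
```

A density this low fits a neuron that fires only on rare structural tokens such as <|eot_id|> rather than on ordinary text.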