Explanations
conditioning
This neuron detects special format markers (e.g. "<|start_header_id|>" and related control-sequence tokens) that delineate header or metadata sections.
Negative Logits
ILL           -0.07
BO            -0.07
portray       -0.07
ボ            -0.06
sanctioned    -0.06
へ            -0.06
ώντας         -0.06
iphy          -0.06
uego          -0.06
丶            -0.06
Positive Logits
src           0.07
næ            0.07
sburgh        0.06
istical       0.06
.SetBool      0.06
사업          0.06
.external     0.06
北市          0.06
}";↵          0.06
victories     0.06
Activations Density 0.502%
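
The density figure above can be read as the share of token positions on which the neuron fires at all. The sketch below shows one way such a number might be computed. It assumes density means "percentage of positions with activation above a threshold of zero", which is a common convention but is not stated on this page. The function name `activation_density` and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of positions where the neuron is
    active (activation > threshold), expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical activations over 1,000 token positions: 5 fire, 995 do not.
sample = [1.3, 0.2, 4.1, 0.7, 2.5] + [0.0] * 995
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.500%"
```

A density near 0.5% would be consistent with a neuron that fires only on a small set of tokens, such as the format markers named in the explanation.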