Explanations
Code/system messages
The neuron fires on section-separator or heading tokens in the prompt (e.g. “## Response:”) that mark the start of the assistant’s response section.
NEGATIVE LOGITS
Tal         -0.07
VIII        -0.07
status      -0.07
opting      -0.07
Status      -0.07
={$         -0.06
decided     -0.06
_dev        -0.06
Panic       -0.06
Policies    -0.06
POSITIVE LOGITS
AnimationFrame   0.07
.surname         0.06
encoding         0.06
&w               0.06
unlawful         0.06
Ceramic          0.06
truyền           0.06
enerj            0.06
Deutsche         0.06
ejac             0.06
Activations Density 0.036%
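
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming density means the fraction of sampled tokens on which the neuron's activation is above zero. The page does not state its exact definition, so the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron is
    active. This is not confirmed by the page itself.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 36 have a nonzero activation.
sample = [0.0] * 99_964 + [1.0] * 36
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 0.1% means the neuron is silent on nearly all tokens, which fits an explanation that ties it to a narrow class of section-heading tokens.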