INDEX
Explanations
The neuron fires on special metadata/control tokens (e.g., header markers such as <|start_header_id|> and <|end_header_id|>, and similar system-formatting tokens).
Negative Logits
(INVOKE     -0.07
utr         -0.07
Allies      -0.06
unteers     -0.06
ibar        -0.06
delegate    -0.06
uable       -0.06
縮          -0.06
referrals   -0.06
acific      -0.06
Positive Logits
_inp        0.07
“.          0.06
anos        0.06
pop         0.06
wore        0.06
nodes       0.06
przez       0.06
[loc        0.06
Budapest    0.06
(point      0.06
Activations Density 0.070%
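
The density figure is presumably the share of sampled token positions on which the neuron has a nonzero activation. The page does not state the exact definition, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of token positions where the
    neuron is active. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 7 active positions out of 10,000 tokens.
sample = [0.0] * 9993 + [1.2] * 7
print(f"{activation_density(sample):.3f}%")  # prints "0.070%"
```

A density this low would be consistent with a neuron that responds only to a small set of rare formatting tokens rather than to ordinary text.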