Explanations
Negation
This neuron detects special metadata tokens—particularly the “<|start_header_id|>” markers.
Negative Logits
/app          -0.08
Brun          -0.07
.EndsWith     -0.06
Parse         -0.06
_defined      -0.06
_Syntax       -0.06
.Components   -0.06
_Context      -0.06
Cached        -0.06
Papers        -0.06
Positive Logits
debated       0.07
Debate        0.06
Alban         0.06
conco         0.06
ßerdem        0.06
состоя        0.06
debian        0.06
граду         0.06
앞            0.06
Connector     0.06
Activations Density: 0.110%