INDEX
Explanations
code and queries
The neuron detects instructional or prompt-related terms in a numerical QA task setup: words like “arithmetic,” “program,” “question,” “Answer,” and “Generate” that form the meta-instructions for composing contexts and calculating answers.
Negative Logits
Token        Logit
.which       -0.07
.findall     -0.07
lightbox     -0.06
/compiler    -0.06
_display     -0.06
Returned     -0.06
輸           -0.06
woff         -0.06
report       -0.06
find         -0.06
Positive Logits
Token        Logit
κορ          0.07
(ierr        0.06
denote       0.06
λιά          0.06
_FORWARD     0.06
ROTO         0.06
=sum         0.06
largo        0.06
errs         0.06
/usr         0.06
Activations Density 0.008%