INDEX
Explanations
instructions
The neuron fires on the key "object-and-parameter" words in step-by-step instructions: terms like "module," "new," and "name" that label what is being renamed or configured.
Negative Logits
leaned    -0.07
Puzzle    -0.07
Ped       -0.07
dummy     -0.07
covid     -0.06
پاد       -0.06
MUX       -0.06
nin       -0.06
That      -0.06
Ts        -0.06
Positive Logits
генера    0.07
"/> ↵     0.07
.pick     0.07
ondere    0.07
цієн      0.07
.tom      0.07
review    0.07
одну      0.06
菌        0.06
#\        0.06
Activations Density 0.202%