Explanations
This neuron fires on programming-language syntax, particularly assignment, method-call, and bracket tokens in code snippets.
Negative Logits
animated    -0.08
widget      -0.07
WRITE       -0.07
seizing     -0.07
đáo         -0.07
trailers    -0.06
configure   -0.06
eggs        -0.06
Beer        -0.06
joys        -0.06
Positive Logits
LOOK        0.06
_slow       0.06
①           0.06
ционный     0.06
insurance   0.06
texas       0.06
anten       0.06
anchor      0.06
vừa         0.06
_sd         0.06
Activations Density: 0.099%