INDEX
Explanations
This neuron fires on tokens in first‐person posts where the author describes or points to their own code/tool (e.g. “I wrote…”, “solution”, repo URLs).
New Auto-Interp
Negative Logits
ो
-0.07
opez
-0.06
-0.06
ensaje
-0.06
_OPTS
-0.06
Dodgers
-0.06
Siege
-0.06
zám
-0.06
Neill
-0.06
.Consumer
-0.06
POSITIVE LOGITS
한
0.07
evenly
0.06
slice
0.06
ainsi
0.06
stellen
0.06
_invalid
0.06
acum
0.06
imestone
0.06
alarında
0.06
ujeme
0.06
Activations Density 0.048%