Explanations
This neuron activates on programming code tokens, i.e. parts of the text containing code examples or code-like syntax.
Negative Logits
Injection     -0.07
vitamins      -0.07
界            -0.07
_band         -0.07
われ          -0.06
-sex          -0.06
цять          -0.06
zw            -0.06
_DR           -0.06
activation    -0.06
Positive Logits
"%(           0.07
род           0.07
.Annotation   0.06
(G            0.06
yüksek        0.06
ayn           0.06
خد            0.06
코            0.06
(sensor       0.06
Derek         0.05
Activations Density 0.003%