Explanations
This neuron activates on programming-style syntax (code tokens and punctuation), effectively detecting source-code segments in the text.
Negative Logits
英語
-0.07
mj
-0.07
Descriptions
-0.07
_sr
-0.07
meer
-0.07
amburger
-0.06
تمر
-0.06
كيل
-0.06
人类
-0.06
halb
-0.06
Positive Logits
156
0.07
artic
0.06
oooo
0.06
159
0.06
hesitate
0.06
hect
0.06
];↵↵
0.06
={()=>
0.06
194
0.06
178
0.06
Activations Density 0.019%