Explanations
This neuron selectively activates on the token “back” when it appears in instructions about triple back-ticks.
Negative Logits
Token            Logit
�                -0.07
Champions        -0.07
ル               -0.07
formats          -0.06
2                -0.06
بوب              -0.06
Modifications    -0.06
Chromium         -0.06
$k               -0.06
champions        -0.06
Positive Logits
Token              Logit
.isAuthenticated   0.07
loi                0.07
BOOT               0.06
rural              0.06
]])↵               0.06
Git                0.06
머                 0.06
');↵↵↵             0.06
familia            0.06
[__                0.06
Activations Density: 0.001%