Explanations
The neuron activates specifically on the literal “\[” bracket token that marks the start of the user-provided answer placeholder.
Negative Logits
Token           Logit
cerr            -0.06
边              -0.06
Carol           -0.06
.rar            -0.06
sideline        -0.06
Toni            -0.06
.AspNetCore     -0.06
%"↵             -0.06
.cn             -0.06
Poe             -0.06

Positive Logits
Token           Logit
genre           0.07
/icons          0.07
.wikipedia      0.07
розрах          0.07
concluding      0.06
aqu             0.06
_language       0.06
philosophical   0.06
.Align          0.06
millions        0.06
Activations Density 0.001%
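
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of sampled token positions on which the neuron's activation is above zero. This is an assumption about the metric, not a description of how this dashboard actually derives it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The dashboard's own definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the neuron fires on 1 of 100,000 token positions.
sample = [0.0] * 99_999 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.001%"
```

Under this reading, a density of 0.001% corresponds to roughly one active token per 100,000 sampled, which would fit a neuron that responds to a single specific token.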