INDEX
Explanations
Problems and challenges
This neuron fires on first-person self-references (I, me, am) in user requests.
Sentences that express a first- or second-person conversational turn that asks for help or poses a clarifying question (i.e., interactive, dialogic requests and responses).
Negative Logits
ीण        -0.07
traps     -0.06
opioid    -0.06
�         -0.06
odpověd   -0.06
_z        -0.06
%X        -0.06
numb      -0.06
ecer      -0.06
От        -0.06
Positive Logits
chez       0.07
Cum        0.07
uplifting  0.06
Advisor    0.06
pired      0.06
standby    0.06
excessive  0.06
portal     0.06
cropping   0.06
aspect     0.06
Activations Density 0.131%
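
A density figure like the one above can be read as a simple ratio. The sketch below is illustrative only: it assumes that density means the fraction of sampled token positions on which the neuron's activation is above zero, which this page does not state. The function name, the threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the neuron fires at all.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 token positions, 131 of which activate.
sample = [1.5] * 131 + [0.0] * (100_000 - 131)
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.131% means the neuron is active on roughly 1 in 760 token positions, which fits a feature tied to one specific kind of conversational phrasing.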