Explanations
exclamation points
This neuron fires on conversational or structural markers in the assistant’s replies, especially greetings (“Hello”), exclamation points, and list-item numerals (e.g. “1.”, “2.”).
Negative Logits
RTE  -0.07
ंबर  -0.07
phetamine  -0.06
unidad  -0.06
ircle  -0.06
▲  -0.06
analý  -0.06
predecess  -0.06
Trong  -0.06
başlay  -0.06
Positive Logits
없어  0.07
roku  0.07
carb  0.06
Chloe  0.06
公  0.06
択  0.06
()?>  0.06
표현  0.06
„M  0.06
scripts  0.06
Activations Density: 0.021%