Explanations
conversation
The neuron consistently activates on the word “conversation” and on nearby framing words in instruction-style prompts (e.g. “Below is a conversation,” “Understand the conversation,” “Based on the conversation”), which suggests it detects system-instruction or prompt-framing lines.
Negative Logits
�             -0.06
WidgetItem    -0.06
965           -0.06
ولی           -0.06
短            -0.06
Nex           -0.06
�             -0.06
_items        -0.06
armacy        -0.06
Lawyers       -0.06
Positive Logits
FileDialog                        0.07
developing                        0.07
ops                               0.07
bastian                           0.07
polynomial                        0.06
------------------------------    0.06
demonstrates                      0.06
COLLECTION                        0.06
địch                              0.06
Consumption                       0.06
Activations Density 0.006%