INDEX
Explanations
conversation snippets
The neuron fires most strongly on list‐item or step numbers (e.g. “1.”, “2.”, “3.”) used to enumerate or order parts of instructions.
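As a rough illustration of the pattern this explanation describes, the sketch below finds list-item numbers such as "1.", "2.", "3." at the start of lines. It is only a text-level proxy, not the neuron or its activation function; the name `find_list_numbers`, the regular expression, and the sample text are all hypothetical and do not come from the dashboard.

```python
import re

# Hypothetical proxy for the described pattern: a number followed by a period
# at the start of a line, then whitespace (e.g. "1. Hold the button.").
LIST_NUMBER = re.compile(r"^[ \t]*(\d+)\.(?=\s)", re.MULTILINE)

def find_list_numbers(text):
    """Return the list-item markers (e.g. "1.") found at line starts."""
    return [m.group(1) + "." for m in LIST_NUMBER.finditer(text)]

# Assumed sample text, invented for illustration only.
sample = (
    "To reset the device:\n"
    "1. Hold the button.\n"
    "2. Wait 10 seconds.\n"
    "3. Release the button.\n"
)
print(find_list_numbers(sample))
```

Numbers inside a sentence (the "10" in "Wait 10 seconds") are ignored because the pattern is anchored to the start of a line; a real activation profile may well differ from this simple rule.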
Negative Logits
highways    -0.06
Convention  -0.06
pz          -0.06
ior         -0.06
wk          -0.06
intimacy    -0.06
400         -0.06
-making     -0.06
ogui        -0.06
veled       -0.06
Positive Logits
І               0.07
ласти           0.07
meetup          0.07
.↵↵↵↵↵↵↵↵↵↵↵↵   0.07
екти            0.07
historia        0.07
,比              0.06
doivent         0.06
])){↵           0.06
){↵↵            0.06
Activations Density 0.032%