Explanations
neural network code
The neuron activates strongly on tokens inside the assistant's enumerated, step-by-step, or list-style instructions (e.g. “Day 1: …”, “Round 2: …”, bullet-point steps) and stays silent on plain user or system text.
Negative Logits
Token                  Logit
солн                   -0.07
OutOfBoundsException   -0.07
ุญ                      -0.06
自身                   -0.06
terrestrial            -0.06
ler                    -0.06
Cheers                 -0.06
게                     -0.06
Pron                   -0.06
ционной                -0.06
Positive Logits
Token                  Logit
urable                 0.07
dbName                 0.07
Cognitive              0.06
.Gen                   0.06
_sector                0.06
>I                     0.06
struction              0.06
wall                   0.06
patent                 0.06
.CH                    0.06
Activations Density: 0.005%
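
To put the density figure in context, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of token positions on which the neuron's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (# positions with activation > threshold) / (# positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the neuron fires on 1 of 20,000 token positions.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.005%
```

Under this reading, a density of 0.005% corresponds to roughly one active token in every 20,000, which fits a neuron that stays silent on most text and fires only in a narrow context.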