INDEX
Explanations
Withholding/requesting specific information
The neuron is detecting special control or metadata tokens (e.g., system instructions and header delimiters) in the prompt.
New Auto-Interp
Negative Logits
_z
-0.07
_PIN
-0.07
essential
-0.07
_SSL
-0.07
.paint
-0.06
alleges
-0.06
herited
-0.06
figcaption
-0.06
ілі
-0.06
ancel
-0.06
POSITIVE LOGITS
_bi
0.06
сал
0.06
tên
0.06
ΑΛ
0.06
Conservative
0.06
unresolved
0.06
victories
0.06
площ
0.06
brushes
0.06
κολ
0.06
Activations Density 0.012%