Explanations
forum posts
This neuron fires on the special boundary token (the newline after <|end_header_id|>) that marks the start of the assistant’s reply.
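To make the position concrete, here is a minimal sketch of where that boundary sits in a rendered prompt. It assumes a Llama-3-style chat layout (`<|start_header_id|>role<|end_header_id|>`, a blank line, the content, then `<|eot_id|>`); the page itself names only `<|end_header_id|>`, so the other markers and all function names below are illustrative assumptions.

```python
# Assumed Llama-3-style chat markers; the page itself only names <|end_header_id|>.
HEADER_START = "<|start_header_id|>"
HEADER_END = "<|end_header_id|>"
TURN_END = "<|eot_id|>"


def render_turn(role: str, content: str) -> str:
    """Render one finished turn: header, blank line, content, end-of-turn marker."""
    return f"{HEADER_START}{role}{HEADER_END}\n\n{content}{TURN_END}"


def render_prompt(messages: list[dict[str, str]]) -> str:
    """Render the conversation, then open an assistant turn for generation."""
    body = "".join(render_turn(m["role"], m["content"]) for m in messages)
    return f"{body}{HEADER_START}assistant{HEADER_END}\n\n"


def assistant_boundary_offsets(prompt: str) -> list[int]:
    """Return the character offset of the newline following each assistant header."""
    marker = f"{HEADER_START}assistant{HEADER_END}"
    offsets = []
    start = prompt.find(marker)
    while start != -1:
        offsets.append(start + len(marker))
        start = prompt.find(marker, start + 1)
    return offsets


prompt = render_prompt([{"role": "user", "content": "Hi"}])
offsets = assistant_boundary_offsets(prompt)
```

These are character offsets, not token indices. Mapping them to the token position where the neuron activates would require the model's own tokenizer, which is not shown here.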
Negative Logits
Lawson          -0.07
isbn            -0.06
illustrations   -0.06
Child           -0.06
ени             -0.06
+'\             -0.06
disciples       -0.06
yh              -0.06
nop             -0.06
разреш          -0.06
Positive Logits
erectile        0.06
overhead        0.06
[layer          0.06
timing          0.06
Metallic        0.06
Sociology       0.06
,n              0.06
بازیگر          0.06
ไว              0.06
(Token          0.06
Activations Density 0.040%