    Explanations

    forum posts

    This neuron fires on a special boundary token: the newline that follows <|end_header_id|> and marks the start of the assistant’s reply.
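
    To make the boundary position concrete, here is a minimal sketch that finds it in an already-tokenized chat prompt. It assumes the tokens arrive as plain strings and that the assistant header is laid out as <|start_header_id|>, assistant, <|end_header_id|>, then a newline-only token. The function name assistant_boundary_positions and the example token list are hypothetical and do not come from the dashboard or any tokenizer library.

```python
from typing import List

START_HEADER = "<|start_header_id|>"
END_HEADER = "<|end_header_id|>"


def assistant_boundary_positions(tokens: List[str]) -> List[int]:
    """Return the index of each newline-only token that directly follows
    an assistant header (hypothetical helper, string tokens assumed)."""
    positions = []
    for i in range(len(tokens) - 3):
        is_assistant_header = (
            tokens[i] == START_HEADER
            and tokens[i + 1] == "assistant"
            and tokens[i + 2] == END_HEADER
        )
        candidate = tokens[i + 3]
        # The boundary token is non-empty and consists only of newlines.
        is_newline_token = candidate != "" and candidate.strip("\n") == ""
        if is_assistant_header and is_newline_token:
            positions.append(i + 3)
    return positions


# Assumed example prompt: one user turn followed by an open assistant turn.
example_tokens = [
    "<|begin_of_text|>",
    START_HEADER, "user", END_HEADER, "\n\n",
    "Hi",
    "<|eot_id|>",
    START_HEADER, "assistant", END_HEADER, "\n\n",
]

boundary = assistant_boundary_positions(example_tokens)
```

    In this example only the newline after the assistant header is reported; the identical newline token after the user header is skipped, which matches the explanation that the neuron marks the start of the assistant’s reply rather than every header boundary.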

    Negative Logits
     Lawson          -0.07
    isbn             -0.06
     illustrations   -0.06
    Child            -0.06
    ени              -0.06
    +'\              -0.06
     disciples       -0.06
    yh               -0.06
    nop              -0.06
     разреш          -0.06
    Positive Logits
     erectile        0.06
     overhead        0.06
    [layer           0.06
    timing           0.06
     Metallic        0.06
     Sociology       0.06
    ,n               0.06
     بازیگر          0.06
    ไว               0.06
    (Token           0.06
    Activation Density: 0.040%

    No Known Activations