INDEX
    Explanations

    This neuron activates on instruction prompts that require the answer to begin explicitly with "Yes" or "No". In other words, it detects the directive that specifies how the response must be formatted.

    Negative Logits
     Tah        -0.06
    viso        -0.06
    \Test       -0.06
     ++;↵       -0.06
     این        -0.06
    enou        -0.06
     printk     -0.06
     Boehner    -0.05
    (and        -0.05
    자의        -0.05
    Positive Logits
    uring            0.07
     converged       0.07
     مورد            0.07
     unidentified    0.07
     lending         0.07
    -local           0.07
    slide            0.07
     squirrel        0.07
    RELATED          0.07
    -party           0.06
    Activation Density: 0.008%
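
For orientation, the density figure above can be read as the share of token positions on which the neuron fires. The page does not say how the value was computed, so the sketch below is only an assumption: it treats any activation above a threshold of zero as "active". The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of positions with a non-zero
    activation; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 8 active positions out of 100,000 tokens.
sample = [0.0] * 99_992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.008%"
```

Under this reading, a density of 0.008% corresponds to roughly 8 active positions per 100,000 tokens, which fits a neuron that responds only to a narrow formatting directive.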

    No Known Activations