    Explanations

    conversational

    This neuron fires on user queries, that is, on tokens that appear in the user’s questions.

    Negative Logits
    Token           Logit
    "AIL"           -0.07
    " Du"           -0.07
    "URING"         -0.07
    (blank)         -0.07
    " contar"       -0.07
    " Serialize"    -0.06
    "ONS"           -0.06
    "asil"          -0.06
    " Nikki"        -0.06
    "vous"          -0.06

    Positive Logits
    Token              Logit
    " tog"             0.07
    "(shift"           0.06
    "ังม"              0.06
    " státy"           0.06
    "hosts"            0.06
    " putt"            0.06
    " subparagraph"    0.06
    " rubbish"         0.06
    "(CONFIG"          0.06
    " dez"             0.06

    Activation Density: 0.063%

    No Known Activations