INDEX
    Explanations

    The neuron fires on Chinese greeting phrases (e.g., “你好…”, “hello…”) at the start of the assistant’s replies.

    Negative Logits
    Cog           -0.07
    prefix        -0.06
     map          -0.06
                  -0.06
    an            -0.06
    Recipe        -0.06
    ANGE          -0.06
     urlparse     -0.06
    /ne           -0.06
    OX            -0.06

    Positive Logits
     peptides      0.07
     бороть        0.07
     produits      0.07
     bryster       0.07
    .SetFloat      0.07
    .isChecked     0.07
     خودش          0.07
     حالت          0.07
    <?↵            0.06
     história      0.06

    Activation Density: 0.017%

    No Known Activations