INDEX
    Explanations

    code/programming

    The neuron activates on directive or instruction words (e.g. “ONLY,” “Give,” “response”) in the user’s prompt.

    Negative Logits
    " jejichž"    -0.06
    " emojis"    -0.06
    ".dec"    -0.06
    " adversity"    -0.06
    "fps"    -0.06
    "کش"    -0.06
    (blank)    -0.06
    "================================================"    -0.06
    " ents"    -0.06
    "toolbar"    -0.06
    Positive Logits
    (blank)    0.07
    " Syracuse"    0.07
    " aku"    0.06
    " раск"    0.06
    "kee"    0.06
    " portfolio"    0.06
    " french"    0.06
    " زیبا"    0.06
    " strange"    0.06
    " evac"    0.06
    Activation Density: 0.059%

    No Known Activations