INDEX
    Explanations

    exclamation points

    This neuron activates on conversational or structural markers in the assistant’s replies, especially greetings (“Hello”), exclamation points, and list-item numerals (e.g. “1.”, “2.”).

    Negative Logits
    " RTE"             -0.07
    "ंबर"               -0.07
    "phetamine"        -0.06
    "unidad"           -0.06
    "ircle"            -0.06
    (token missing)    -0.06
    " analý"           -0.06
    " predecess"       -0.06
    " Trong"           -0.06
    " başlay"          -0.06
    Positive Logits
    " 없어"            0.07
    " roku"            0.07
    " carb"            0.06
    " Chloe"           0.06
    (token missing)    0.06
    (token missing)    0.06
    "()?>"             0.06
    " 표현"            0.06
    "„M"               0.06
    " scripts"         0.06
    Act Density 0.021%

    No Known Activations