INDEX
    Explanations

    Conversational exchanges that involve humor and sarcasm.

    This neuron activates on the phrase “AI assistant,” flagging references to the system or assistant role.

    Negative Logits
    " reusable"      -0.07
    " Zones"         -0.07
    " Suns"          -0.06
    " Operations"    -0.06
    "Chr"            -0.06
    "-button"        -0.06
    (not shown)      -0.06
    " monday"        -0.06
    "(Float"         -0.06
    " mothers"       -0.06
    Positive Logits
    "oultry"         0.08
    " wallpapers"    0.08
    "MenuBar"        0.06
    " garg"          0.06
    " zdravot"       0.06
    "*time"          0.06
    " TreeSet"       0.06
    (not shown)      0.06
    " jorn"          0.06
    "frontend"       0.06
    Act Density 0.002%

    No Known Activations