INDEX
    Explanations

    The neuron fires on direct second-person guidance phrases, especially the sequence “you are.”

    New Auto-Interp
    Negative Logits
     Brigade
    -0.07
     feudal
    -0.07
     japan
    -0.07
     الم
    -0.07
    +z
    -0.07
     remove
    -0.07
     dön
    -0.07
    	z
    -0.06
     Dealer
    -0.06
     Panic
    -0.06
    POSITIVE LOGITS
    の上
    0.06
     všem
    0.06
    .Exception
    0.06
     güzel
    0.06
     обо
    0.06
    °}
    0.06
    0.06
     onPress
    0.06
     všichni
    0.06
     geçen
    0.06
    Act Density 0.020%

    No Known Activations