    Explanations

    common articles/pronouns

    This neuron activates on the word “In” when it begins a new sentence or paragraph, marking sentence‐initial discourse transitions.

    Negative Logits
    Token           Logit
    " Sharing"      -0.07
    " Syn"          -0.07
    " relax"        -0.06
    "eding"         -0.06
    " Play"         -0.06
    "\tsd"          -0.06
    "asting"        -0.06
    "\tn"           -0.06
    "Playable"      -0.06
    " Spot"         -0.06
    Positive Logits
    Token               Logit
    "hait"              0.07
    (token not shown)   0.07
    "(lambda"           0.07
    " not"              0.07
    (token not shown)   0.06
    " ;;↵"              0.06
    " Fortunately"      0.06
    " puede"            0.06
    "Translatef"        0.06
    " pla"              0.06
    Activation Density: 0.219%

    No Known Activations