    Explanations

    quotations with pronouns

    This neuron triggers on the first word of a quoted dialogue line (i.e. the token that begins a character’s spoken sentence).
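To make the description above concrete, here is a minimal sketch of a text-level heuristic that picks out the first word of each quoted span. It is an illustration only, not the neuron's actual mechanism: the function name `first_words_of_quotes` and the regex are hypothetical, and the heuristic works on raw text rather than on the model's tokenizer output (where the quote mark and the word may form a single token, as in `"The`).

```python
import re

# Opening quote characters that appear in the positive-logit tokens on this page:
# straight double quote, curly opening double quote, straight single quote.
OPENING_QUOTES = "\"“'"

# Hypothetical heuristic: an opening quote at the start of the text or right after
# whitespace, immediately followed by a word. Closing quotes usually follow
# punctuation or a letter, so they are not matched.
_DIALOGUE_START = re.compile(
    r"(?:^|(?<=\s))[" + re.escape(OPENING_QUOTES) + r"](\w+)"
)


def first_words_of_quotes(text: str) -> list[str]:
    """Return the first word of each quoted span found in `text`."""
    return [match.group(1) for match in _DIALOGUE_START.finditer(text)]


sample = 'She turned. "We should go," he said. “The door is open.” It\'s late.'
print(first_words_of_quotes(sample))
```

On the sample sentence the heuristic selects `We` and `The`, the positions where the description says the neuron should fire, and it skips the closing quotes and the apostrophe in `It's`.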

    Negative Logits
    Token       Logit
    ?           -0.07
    !           -0.07
    +           -0.07
    :           -0.06
      ↵    ↵    -0.06
    ?           -0.06
     +↵         -0.06
    t           -0.06
    \uc         -0.06
    (tab)       -0.06
    Positive Logits
    Token       Logit
    "The        0.08
    "I          0.08
    "We         0.08
    “The        0.07
    \P          0.07
    'A          0.07
    زینه        0.07
     García     0.07
    "This       0.07
    “We         0.07
    Activation Density: 0.025%
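The page does not define the density figure. A common reading, assumed here, is the fraction of sampled tokens on which the neuron's activation is above zero. Under that assumption, a minimal sketch of the calculation looks like this (the function name `activation_density` and the `threshold` parameter are hypothetical):

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumes "density" means the share of sampled tokens on which the neuron
    fires; the source page does not state its exact definition.
    """
    values = list(activations)
    if not values:
        return 0.0
    fired = sum(1 for value in values if value > threshold)
    return fired / len(values)


# Illustrative data only: one firing token out of 4000 gives 0.025%.
example = [0.0] * 3999 + [1.2]
print(f"{activation_density(example):.3%}")
```

Read this way, a density of 0.025% means the neuron fires on roughly 1 token in every 4000.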

    No Known Activations