INDEX
    Explanations

    This neuron activates on sentence-initial or clause-linking discourse markers (e.g. “But,” “And,” “Although”) rather than content words.

    Negative Logits
    Token               Logit
    "_preferences"      -0.07
    " interpolation"    -0.07
    ".GONE"             -0.06
    (blank)             -0.06
    ".splice"           -0.06
    " connected"        -0.06
    " Pey"              -0.06
    " appliance"        -0.06
    " Colorado"         -0.06
    " Penny"            -0.06

    Positive Logits
    Token               Logit
    " desperately"      0.07
    " INTERN"           0.07
    " mn"               0.07
    " yerleş"           0.06
    "ита"               0.06
    "اده"               0.06
    " وي"               0.06
    " dd"               0.06
    (blank)             0.06
    " tất"              0.06

    Activation Density: 0.474%

    No Known Activations