    Explanations

    The neuron selectively activates on sentence-initial tokens, i.e. words that occur at the start of a sentence or a new paragraph.

    Negative Logits
     Discussions
    -0.07
    erse
    -0.07
     ue
    -0.07
     FOUND
    -0.07
    อำนวย
    -0.06
    ょう
    -0.06
     readFile
    -0.06
    FromFile
    -0.06
     exper
    -0.06
     Tap
    -0.06
    Positive Logits
    0.07
     proprietor
    0.06
    figur
    0.06
    0.06
    Disclaimer
    0.06
    >{{$
    0.06
     Humanities
    0.06
    0.06
    Cs
    0.06
    ([$
    0.06
    Activation Density: 0.466%

    No Known Activations