    Explanations

    The neuron activates on occurrences of the word “question” (and its surrounding punctuation) in the prompt text.

    Negative Logits
    Token               Logit
    (blank)             -0.07
    "mirror"            -0.07
    "arnation"          -0.07
    "با"                -0.07
    "apping"            -0.07
    " viewPager"        -0.07
    " inauguration"     -0.07
    " />,↵"             -0.07
    " BR"               -0.07
    "xCA"               -0.07
    Positive Logits
    Token               Logit
    " minimized"        0.06
    " undert"           0.06
    " conqu"            0.06
    "월까지"            0.06
    "χεί"               0.06
    " discrepancies"    0.06
    " QIcon"            0.06
    (blank)             0.05
    "://{"              0.05
    "-dd"               0.05
    Activation Density: 0.011%

    No Known Activations