INDEX
    Explanations

    The neuron fires on the “Q” markers that introduce question prompts.

    Negative Logits
    -0.07
    HY
    -0.07
     Soup
    -0.07
     Hitch
    -0.07
    _hor
    -0.07
    _CAP
    -0.06
     ΕΠ
    -0.06
    -0.06
    物理
    -0.06
    MMdd
    -0.06
    Positive Logits
     Q
    0.11
    Q
    0.09
    ounces
    0.07
     q
    0.07
    .Q
    0.07
     begged
    0.07
    errals
    0.07
     قدر
    0.06
    /q
    0.06
    -Q
    0.06
    Activation Density: 0.038%

    No Known Activations