INDEX
    Explanations

    The neuron is primarily activated by occurrences of the subword “bit” (as in “a bit of…”).

    Negative Logits
    Token               Logit
    "ckt"               -0.07
    (not shown)         -0.07
    "https"             -0.06
    "_threads"          -0.06
    (not shown)         -0.06
    "alled"             -0.06
    "ouncement"         -0.06
    "uslim"             -0.06
    " Organizations"    -0.06
    "oled"              -0.06
    Positive Logits
    Token               Logit
    " conco"            0.07
    " правило"          0.06
    " ditch"            0.06
    " 쪽지"             0.06
    " aesthetics"       0.06
    " DISP"             0.06
    (not shown)         0.06
    " practitioner"     0.06
    " of"               0.06
    "주의"              0.06
    Activation Density 0.009%

    No Known Activations