    Explanations

    The neuron fires on semantically rich, content-bearing tokens (especially verbs and nouns) rather than common function words.

    Negative Logits
    Token            Logit
    "्यत"            -0.08
    "names"          -0.07
    "-only"          -0.07
    "only"           -0.07
    " submitted"     -0.06
    " гаран"         -0.06
    "nonnull"        -0.06
    " surveys"       -0.06
    " harvested"     -0.06
    "(messages"      -0.06
    Positive Logits
    Token            Logit
    "026"            0.06
    "่านมา"          0.06
    "ียว"            0.06
    " household"     0.06
    " tvoří"         0.06
    "quoise"         0.06
    " coma"          0.06
    "이다"           0.06
    " nhu"           0.06
    "ulumi"          0.06
    Activation Density: 0.256%

    No Known Activations