INDEX
    Explanations

    This neuron activates on longer, domain-specific nouns, especially proper names or technical terms.

    Negative Logits
    lever           -0.06
    weak            -0.06
    Work            -0.06
    "In             -0.06
    Kid             -0.06
     looks          -0.06
    ico             -0.06
     pancreatic     -0.06
    ався            -0.06
    output          -0.06
    Positive Logits
     nak            0.07
     Му             0.07
     aşam           0.07
    pmat            0.07
     bury           0.07
    .CompareTo      0.06
     nez            0.06
                    0.06
    ЛО              0.06
     husus          0.06
    Activation Density: 0.358%

    No Known Activations