INDEX
    Explanations

    This neuron detects the definite article “the.”

    Negative Logits
    Token               Logit
     koruy              -0.07
    (token not shown)   -0.07
    (rot                -0.07
    _B                  -0.07
    (token not shown)   -0.07
     limiting           -0.07
    ۸                   -0.06
     increasingly       -0.06
    .app                -0.06
    ۹                   -0.06

    Positive Logits
    Token               Logit
    ObjectName          0.06
     standoff           0.06
    _ioctl              0.06
    .httpClient         0.06
     Sampler            0.06
     defenses           0.06
     spouses            0.06
    šní                 0.06
    -the                0.06
    ımın                0.06

    Activation Density: 0.037%

    No Known Activations