    Explanations

    This neuron essentially never activates: it is effectively “dead” and does not respond to any known token.

    Negative Logits
    " зна"            -0.07
    "्स"              -0.06
    " dissertation"   -0.06
    " دانش"           -0.06
    "xima"            -0.06
    "대학교"          -0.06
    " IBM"            -0.06
    " Maryland"       -0.06
    " surv"           -0.06
    "urd"             -0.06
    Positive Logits
    "Range"           0.08
    " additive"       0.07
    "(Collection"     0.07
    " --↵"            0.07
    " nickname"       0.07
    "(man"            0.07
    "-out"            0.07
    "-buffer"         0.06
    "Array"           0.06
    "__(/*!"          0.06
    Activation Density: 0.007%

    No Known Activations