INDEX
    Explanations

    The neuron fires on Wikipedia‐style category tags (i.e. lines beginning with “Category:”).
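The pattern named in the explanation can be sketched as a simple line matcher. This is only an illustration of what "lines beginning with Category:" means, not the neuron's actual behavior. The regular expression, the function name `find_category_tags`, and the sample text are all assumptions introduced here.

```python
import re

# Assumption: a category tag is a line that starts with the literal text
# "Category:" followed by a non-empty name on the same line.
CATEGORY_LINE = re.compile(r"^Category:[ \t]*(?P<name>.+?)[ \t]*$", re.MULTILINE)


def find_category_tags(text: str) -> list[str]:
    """Return the category names from lines that begin with 'Category:'."""
    return [m.group("name") for m in CATEGORY_LINE.finditer(text)]


# Hypothetical sample text, made up for this sketch.
sample = (
    "Some article text.\n"
    "Category:Birds of Europe\n"
    "Category: 1990 births\n"
    "See also Category:Not at line start"
)

print(find_category_tags(sample))
```

The last sample line is skipped because "Category:" does not appear at the start of the line, which matches the explanation's wording.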

    Negative Logits
    "supports"      -0.07
    "stashop"       -0.07
    " redirects"    -0.06
    " promote"      -0.06
    " freder"       -0.06
    "changed"       -0.06
    " derive"       -0.06
    " realized"     -0.06
    "드리"          -0.06
    "volent"        -0.06
    Positive Logits
    " Id"             0.07
    " nicotine"       0.07
    "itten"           0.07
    "_ind"            0.06
    "Issues"          0.06
    "ousing"          0.06
    " Jal"            0.06
    " Habitat"        0.06
    " aos"            0.06
    " longitudinal"   0.06
    Activation Density: 0.008%

    No Known Activations