    Explanations

    The neuron fires on Wikipedia‐style section headings (e.g. “References,” “External links,” “Category: …”).

    Negative Logits
    " Madagascar"                -0.06
    "ön"                         -0.06
    "_atts"                      -0.06
    (token missing in source)    -0.06
    "یان"                        -0.06
    (undecodable token)          -0.06
    "ัฐ"                         -0.06
    "_imgs"                      -0.06
    "IVERS"                      -0.06
    " insensitive"               -0.06
    Positive Logits
    "ile"             0.07
    " fica"           0.07
    " velit"          0.06
    " measurement"    0.06
    "бор"             0.06
    "Roll"            0.06
    "Scroll"          0.06
    " disg"           0.06
    ".directive"      0.06
    "ーク"            0.06
    Activation Density: 0.009%

    No Known Activations