    Explanations

    This neuron is effectively dead: it activates on almost no tokens.

    Negative Logits
    Token            Logit
    "}};↵"           -0.08
    "]."             -0.07
    " relig"         -0.07
    " dossier"       -0.07
    "edeyse"         -0.07
    (blank)          -0.07
    "expression"     -0.07
    "urlpatterns"    -0.07
    " ]."            -0.06
    "bridge"         -0.06
    Positive Logits
    Token            Logit
    (blank)          0.07
    "orex"           0.06
    "отов"           0.06
    "eder"           0.06
    " pueden"        0.06
    (blank)          0.06
    " timeouts"      0.06
    (blank)          0.06
    "(dr"            0.06
    " grabbed"       0.06
    Activation Density: 0.019%

    No Known Activations