    Explanations

    This neuron does not activate on any tokens; it remains effectively silent and detects no patterns.

    Negative Logits
     poles            -0.07
     capitalize       -0.07
     fare             -0.06
     цвет             -0.06
     dirt             -0.06
    Baseline          -0.06
    Parallel          -0.06
    dead              -0.06
    ides              -0.06
    enchmark          -0.06
    Positive Logits
    (jq               0.07
    _TEMPLATE         0.06
    ,eg               0.06
     hg               0.06
                      0.06
    “그               0.06
    raig              0.06
    ع                 0.06
    authentication    0.06
     nossa            0.06
    Activation Density 0.001%

    No Known Activations