INDEX
    Explanations

    This neuron fires on speaker‐ or character‐identifier tokens (e.g., “NAME_1”, “NAME_5”, etc.).

    Negative Logits
     pedal            -0.07
     Province         -0.07
     distortion       -0.07
    --;↵              -0.07
    ��                -0.07
    outline           -0.06
    .creator          -0.06
    (blank token)     -0.06
    <|python_tag|>    -0.06
     position         -0.06

    Positive Logits
     EZ               0.07
    _removed          0.07
    _INCREMENT        0.06
     dinosaur         0.06
     없었             0.06
     místě            0.06
    return            0.06
     Emm              0.06
     Zoe              0.06
    .stereotype       0.06

    Activation Density: 0.011%

    No Known Activations