INDEX
    Explanations

    Withholding/requesting specific information

    The neuron is detecting special control or metadata tokens (e.g., system instructions and header delimiters) in the prompt.

    New Auto-Interp
    Negative Logits
    _z
    -0.07
    _PIN
    -0.07
    essential
    -0.07
    _SSL
    -0.07
    .paint
    -0.06
     alleges
    -0.06
    herited
    -0.06
    figcaption
    -0.06
    ілі
    -0.06
    ancel
    -0.06
    POSITIVE LOGITS
    _bi
    0.06
     сал
    0.06
     tên
    0.06
    ΑΛ
    0.06
     Conservative
    0.06
     unresolved
    0.06
     victories
    0.06
     площ
    0.06
     brushes
    0.06
    κολ
    0.06
    Act Density 0.012%

    No Known Activations