INDEX
    Explanations

    Code/Configuration snippets

    The neuron detects tokens in placeholder-style prompts (e.g. "Inserire", "Insert", "Insira") that ask the user to insert or fill in content.

    Negative Logits
    Token                   Logit
    "令人"                  -0.07
    "your"                  -0.07
    (token text missing)    -0.06
    "hattan"                -0.06
    "AX"                    -0.06
    "še"                    -0.06
    " cooled"               -0.06
    "KNOWN"                 -0.06
    " estoy"                -0.06
    "ателя"                 -0.06
    Positive Logits
    Token                   Logit
    "muz"                   0.06
    " Ups"                  0.06
    ".$$"                   0.06
    " blessing"             0.06
    "自分"                  0.06
    "_SP"                   0.06
    "gil"                   0.06
    "usunda"                0.06
    " TD"                   0.06
    (token text missing)    0.05
    Activation Density: 0.021%
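
    As a rough illustration of what a density figure like this usually means, the sketch below assumes that activation density is the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample values are all assumptions made for illustration; they are not taken from this page or from any particular tool.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires.
    """
    values = list(activations)
    if not values:
        # No tokens observed, so report zero density rather than dividing by zero.
        return 0.0
    fired = sum(1 for a in values if a > threshold)
    return fired / len(values)


# Hypothetical activations over 10,000 token positions, two of them non-zero.
example = [0.0] * 9998 + [0.8, 1.3]
density = activation_density(example)
print(f"{density:.3%}")
```

    Under this reading, a density of a few hundredths of a percent means the neuron fires on only a handful of tokens per ten thousand, which would fit a neuron tuned to a narrow pattern such as placeholder prompts.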

    No Known Activations