INDEX
    Explanations

    questions/dialogue

    The neuron responds to tokens in first‐person or quoted internal thoughts (e.g. the opening quotation mark and words like “I,” “thought,” “sure” in introspective statements).

    Negative Logits
    "ється"         -0.06
    " vscode"       -0.06
    " Vám"          -0.06
    "ádu"           -0.06
    "_yaw"          -0.06
    "ADV"           -0.06
    "realm"         -0.06
    " cry"          -0.06
    " lij"          -0.06
    ".Card"         -0.06
    Positive Logits
    " ';↵"          0.07
    " vag"          0.07
                    0.06
                    0.06
    "uate"          0.06
    " staging"      0.06
    " correctness"  0.06
    " hourly"       0.06
    " Hispan"       0.06
    "-‐"            0.06
    Act Density 0.004%

    No Known Activations