    Explanations

    This neuron primarily activates on placeholder entity tokens (e.g. "NAME", "NAME_2") rather than on ordinary words.

    Negative Logits
     outings
    -0.07
    PRS
    -0.07
     Pink
    -0.06
    بس
    -0.06
    Twenty
    -0.06
    -0.06
     Gavin
    -0.06
    %).↵↵
    -0.06
     matchups
    -0.06
     peanuts
    -0.06
    Positive Logits
     converter
    0.06
    _agent
    0.06
    .dll
    0.06
    _DATA
    0.06
    Undo
    0.06
     forgiveness
    0.06
     uvol
    0.06
     overrides
    0.06
     Cmd
    0.06
    uat
    0.06
    Activation Density: 0.011%

    No Known Activations