INDEX
    Explanations

    This neuron fires on personal pronouns, especially second-person and third-person forms such as “you,” “him,” and “she.”

    Negative Logits
    Token             Logit
    "符合"            -0.07
    " Systems"        -0.07
    " upto"           -0.07
    " genocide"       -0.06
    (blank in source) -0.06
    " Ree"            -0.06
    " ̄ ̄"            -0.06
    "าเล"             -0.06
    "Separ"           -0.06
    "prestashop"      -0.06

    Positive Logits
    Token             Logit
    "ДА"              0.06
    " marital"        0.06
    " altering"       0.06
    "صر"              0.06
    "([↵"             0.06
    " trolls"         0.06
    ".then"           0.06
    " úč"             0.06
    "/Q"              0.06
    (blank in source) 0.06
    Act Density 0.159%
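
    The density figure above is read here as the fraction of token positions on which the neuron produces a nonzero activation. That reading is an assumption, not something the page states. Under it, a minimal sketch of the calculation might look like the following; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    `activations` is a flat list of per-token activation values (hypothetical input).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up example: 2,000 token positions, 3 of which fire.
sample = [0.0] * 1997 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.150%
```

    A density near 0.159% would then correspond to roughly 16 active positions per 10,000 tokens.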

    No Known Activations