INDEX
    Explanations

    This neuron primarily activates on the words “they” and “do,” effectively detecting occurrences of the pronoun “they” (often in the phrase “they do”).

    Negative Logits
     Category     -0.07
     Tags         -0.07
     veter        -0.06
     Privacy      -0.06
     sneakers     -0.06
     Guys         -0.06
    اصل           -0.06
     saints       -0.06
     reunion      -0.06
    ,但           -0.06
    Positive Logits
    shall         0.06
    setContent    0.06
    odyn          0.06
                  0.06
    WARDED        0.06
    лением        0.06
                  0.06
    SB            0.06
    ίζει          0.06
     busiest      0.06
    Act Density 0.000%

    No Known Activations