INDEX
    Explanations

    This neuron responds to the appearance of the verb “features” (and its close variants) in descriptive sentences.

    Negative Logits
    " clamp"         -0.07
    " ANSW"          -0.07
    " global"        -0.07
    "(zip"           -0.07
    " yyn"           -0.06
    " bind"          -0.06
    " takes"         -0.06
    " suppress"      -0.06
    " giveaways"     -0.06
    " escape"        -0.06

    Positive Logits
    " featuring"      0.10
    " features"       0.08
    " Featuring"      0.07
    "aysia"           0.07
    "porto"           0.07
    "uest"            0.07
                      0.07
    "stein"           0.06
    " perfection"     0.06
    " Features"       0.06
    Act Density 0.012%

    No Known Activations
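
The "Act Density" figure can be read as a simple ratio. The sketch below assumes it denotes the fraction of token positions at which the neuron's activation is above zero. That definition is an assumption, since the page does not state it, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "Act Density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 12 active positions out of 100,000 tokens.
acts = [0.0] * 100_000
for i in range(12):
    acts[i * 1000] = 0.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.012%"
```

Under this reading, a density of 0.012% corresponds to roughly 12 active positions per 100,000 tokens.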