    Explanations

    qualifying words

    This neuron strongly activates on adverbs, especially words ending in “-ly.”

    Negative Logits
    Token           Logit
     select         -0.07
     harassment     -0.07
     climbed        -0.07
    '.↵             -0.07
     Dresses        -0.07
    Discussion      -0.07
    -about          -0.07
    Send            -0.06
     MIC            -0.06
    |.↵             -0.06
    Positive Logits
    Token           Logit
     Horizon        0.06
                    0.06
     Ticaret        0.06
    	Namespace      0.06
    ्छ              0.06
     volume         0.06
     Eug            0.06
    :"#             0.06
    ivant           0.06
     specialize     0.06
    Act Density 0.111%

    No Known Activations