    Explanations

    This neuron flags polite first-person desire or preference constructions, especially phrases like “I would like to ….”

    Negative Logits
     kırmızı
    -0.06
    _axes
    -0.06
    ivering
    -0.06
    нения
    -0.06
     Osama
    -0.06
     Ethernet
    -0.06
     výkon
    -0.06
    vious
    -0.06
    등학교
    -0.06
    _fc
    -0.06
    Positive Logits
     ung
    0.07
     Rip
    0.07
    	value
    0.06
    indic
    0.06
    Traditional
    0.06
    	tr
    0.06
     clinic
    0.06
    ellery
    0.06
    cessive
    0.06
    ierrez
    0.06
    Activation Density: 0.013%

    No Known Activations