    Explanations

    The neuron activates on universal quantifiers (e.g. “everyone,” “everything”), words that refer to all members of a group or to a whole.

    Negative Logits
    Token            Logit
    " briefly"       -0.06
    " little"        -0.06
    "(det"           -0.06
    "768"            -0.06
    ".SetKeyName"    -0.06
    "-haired"        -0.06
    " indictment"    -0.06
    "637"            -0.06
    "(heap"          -0.06
    "-tax"           -0.06
    Positive Logits
    Token              Logit
    " supplemental"    0.08
    "ps"               0.07
    "zc"               0.07
    " Manufacturing"   0.07
    " dorm"            0.06
    "房间"             0.06
    " all"             0.06
    " hepsi"           0.06
    "cape"             0.06
    " visible"         0.06
    Activation Density: 0.055%

    No Known Activations