    Explanations

    negative feelings

    The neuron activates on language describing a consuming or destructive takeover: words that signal a powerful, overwhelming impact (e.g. “toll,” “takes,” “take over,” “power”).

    Negative Logits
    Token           Logit
    "_on"           -0.07
    " varios"       -0.06
    "indx"          -0.06
    "_xt"           -0.06
    " нам"          -0.06
    "\tx"           -0.06
    " confusion"    -0.06
    " milk"         -0.06
    " cri"          -0.06
    "\t"            -0.06
    Positive Logits
    Token           Logit
    "MAIL"          0.07
    "Opaque"        0.06
    " TODAY"        0.06
    " STD"          0.06
    " اپ"           0.06
    "Dragging"      0.06
    "entionPolicy"  0.06
                    0.06
    "_BOTTOM"       0.06
    "progress"      0.06
    Activation Density: 0.067%

    No Known Activations