Explanations
The neuron activates on universal quantifiers (e.g. “everyone,” “everything”), words that refer to all members of a group or to a whole.
Negative Logits
briefly        -0.06
little         -0.06
(det           -0.06
768            -0.06
.SetKeyName    -0.06
-haired        -0.06
indictment     -0.06
637            -0.06
(heap          -0.06
-tax           -0.06
Positive Logits
supplemental   0.08
ps             0.07
zc             0.07
Manufacturing  0.07
dorm           0.06
房间           0.06
all            0.06
hepsi          0.06
cape           0.06
visible        0.06
Activations Density 0.055%
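As a rough illustration of what a density figure like this means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as the share of positions with activation above a threshold; that definition, the function name `activation_density`, the `threshold` parameter, and the synthetic sample are all assumptions for illustration, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as active when its activation
    is strictly greater than `threshold` (hypothetical parameter).
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample (not real data): 20,000 token positions, 11 of them active.
sample = [0.0] * 19_989 + [1.5] * 11
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.055%
```

Under this assumed definition, a density of 0.055% corresponds to the feature firing on roughly 5 or 6 of every 10,000 tokens.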