INDEX
Explanations
envy and admiration
references to cooking and meal preparation.
This neuron responds to general referencing and explanatory language, especially third-person references (they, people) and question/explanation words (how, do, make) used in conversational or expository statements.
Negative Logits
Token           Logit
decent          -0.07
board           -0.07
ред             -0.07
427             -0.07
422             -0.06
ddie            -0.06
Joey            -0.06
subscription    -0.06
Sun             -0.06
conditions      -0.06
Positive Logits
Token           Logit
.isdigit        0.07
Gaut            0.06
cmb             0.06
teness          0.06
.es             0.06
λό              0.06
diner           0.06
тол             0.06
quantify        0.06
'}, ↵           0.06
Activations Density 0.035%