Explanations
small amounts accumulating
This neuron responds to phrasing about additive accumulation, especially the idiom “add up” and its immediate context.
Negative Logits
-html       -0.06
лід         -0.06
xies        -0.06
हर          -0.06
สมเด        -0.06
focused     -0.06
냥          -0.06
.@          -0.06
Дж          -0.06
vfs         -0.06
Positive Logits
ripped      0.07
worthy      0.06
_almost     0.06
\"><        0.06
@"\         0.06
chtě        0.06
').'</      0.06
/tr         0.06
yelling     0.06
visibility  0.06
Activations Density: 0.019%
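
To give a sense of scale for the density figure, here is a minimal sketch. It assumes density means the fraction of token positions where the neuron's activation is above zero; the page itself does not define the term. The function name, the threshold, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the neuron fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 19 of them active.
sample = [0.0] * 99_981 + [1.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.019%"
```

Under that assumption, a density of 0.019% corresponds to roughly 19 active positions per 100,000 tokens, which fits a neuron tied to a narrow idiom.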