INDEX
Explanations
This neuron activates on terms that denote an overall or aggregate measure, such as “whole,” “overall,” “system,” “performance,” or “volume.”
Negative Logits
ấm            -0.08
менее         -0.07
tak           -0.07
Jackson       -0.07
aver          -0.06
Ingredient    -0.06
attention     -0.06
aceutical     -0.06
eş            -0.06
알            -0.06
Positive Logits
=p            0.07
:v            0.06
(resources    0.06
(V            0.06
+t            0.06
idea          0.06
,s            0.06
.Write        0.06
ORDER         0.06
.segments     0.06
Activations Density: 0.025%