Explanations
this that
This neuron does not activate for any input and thus does not detect any specific feature.
Negative Logits
particularly    -0.07
Schultz         -0.06
Rud             -0.06
Professor       -0.06
annabin         -0.06
eting           -0.06
van             -0.06
ARR             -0.06
awarded         -0.06
ัง              -0.06
Positive Logits
']=             0.07
)}"↵            0.07
/token          0.07
)"},↵           0.07
_stderr         0.06
})↵             0.06
);↵↵            0.06
(ErrorMessage   0.06
)"↵↵            0.06
.insert         0.06
Activations Density 0.045%
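
As a rough illustration of how a density figure like the one above could be obtained, the sketch below assumes that activation density means the fraction of sampled tokens on which the neuron's activation exceeds zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# tokens with activation > threshold) / (# tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 9 active tokens out of 20,000 sampled tokens.
sample = [0.0] * 19991 + [0.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.045%
```

Under this assumed definition, a density of 0.045% corresponds to the neuron firing on roughly 9 of every 20,000 tokens.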