Explanations
This neuron detects instruction phrases about completing the user’s request (e.g., “completes the request”).
Negative Logits
Token       Logit
landa       -0.06
stances     -0.06
macros      -0.06
pistols     -0.06
玉          -0.06
dbus        -0.06
经验        -0.06
셀          -0.06
ýval        -0.06
(that       -0.06
Positive Logits
Token       Logit
ODULE       0.07
vulner      0.06
ξει         0.06
pw          0.06
owntown     0.06
konum       0.06
وروب        0.06
downside    0.06
ünd         0.06
.ViewGroup  0.06
Activations Density 0.002%