Explanations
"about me" introductions
The neuron activates on the assistant’s self-identification phrases, particularly when it names itself (e.g. “Vicuna”) in its introductory responses.
Negative Logits
predicate     -0.06
intimate      -0.06
Jer           -0.06
(candidate    -0.06
lament        -0.06
rior          -0.06
Article       -0.06
��            -0.06
ROLL          -0.06
gre           -0.06
Positive Logits
.Unmarshal    0.06
arreglo       0.06
}_            0.06
Blackhawks    0.06
sao           0.06
Fla           0.06
zvlášt        0.06
.methods      0.06
HDR           0.06
GRID          0.06
Activations Density 0.008%