INDEX
Explanations
conversational prompts
This neuron fires on instructional or procedural language, particularly step-by-step directions phrased with action verbs (e.g. “draw,” “ask,” “have,” “write”).
Negative Logits
assuming        -0.07
٬               -0.07
uned            -0.07
攝              -0.06
_PROPERTY       -0.06
تیم             -0.06
commissioned    -0.06
rec             -0.06
thinner         -0.06
OBS             -0.06
Positive Logits
Calif           0.07
willen          0.06
yıldır          0.06
elu             0.06
izada           0.06
Доб             0.06
aucun           0.06
zeigen          0.06
relation        0.06
จำก             0.06
Activations Density 0.131%
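
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that activation density means the fraction of tokens on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, the default threshold of zero, and the sample data are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (tokens where the neuron fires) / (all tokens).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the neuron fires on 131 of 100,000 tokens.
sample = [1.0] * 131 + [0.0] * (100_000 - 131)
print(f"{activation_density(sample):.3%}")  # prints 0.131%
```

Under this assumed definition, a density of 0.131% would correspond to the neuron firing on roughly 1.3 of every 1,000 tokens.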