Explanations
instructions or advice
The neuron fires on tokens that appear in step-by-step troubleshooting or instructional sentences, especially the common words and verbs used in numbered how-to guidance.
Negative Logits
projector     -0.08
Constant      -0.07
icles         -0.07
ursors        -0.07
alted         -0.07
(___          -0.07
microscopy    -0.06
nord          -0.06
antically     -0.06
olecular      -0.06
Positive Logits
tady          0.07
wind          0.07
trom          0.06
исключ        0.06
archivo       0.06
ocation       0.06
hill          0.06
UIGraphics    0.06
ville         0.06
때            0.06
Activations Density 0.055%
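
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the neuron's activation is above zero, might look like the following. The function name, the threshold parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron
    is active at all; the source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 11 active tokens out of 20,000.
acts = [0.0] * 19989 + [0.4] * 11
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.055% means the neuron is active on roughly 1 token in 1,800, which fits a neuron that responds to a fairly narrow class of instructional text.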