Explanations
shifting narratives
The neuron activates on tokens that refer to shifts in narrative viewpoint, such as “alternating,” “perspectives,” “shifts,” “POV,” and “narratives.”
Negative Logits
steer             -0.07
eon               -0.07
баж               -0.07
н                 -0.06
(floor            -0.06
timestep          -0.06
预                -0.06
spotlight         -0.06
banks             -0.06
@property         -0.06
Positive Logits
===============↵  0.07
sterile           0.07
_CI               0.06
_ar               0.06
renk              0.06
nominal           0.06
iciones           0.06
siempre           0.06
indirectly        0.06
_Main             0.06
Activations Density 0.046%
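A minimal sketch of how a density figure like this could be computed, assuming "density" means the fraction of sampled tokens on which the neuron's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density is counted per token, and a token "fires"
    when its activation exceeds the threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 46 firing tokens out of 100,000 sampled tokens.
sample = [1.0] * 46 + [0.0] * (100_000 - 46)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.046% would correspond to the neuron firing on roughly 46 of every 100,000 tokens.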