INDEX
Explanations
The neuron consistently fires on numeric tokens, especially those representing decimal-point values.
Negative Logits
.en
-0.06
wit
-0.06
σαν
-0.06
Canvas
-0.06
služeb
-0.06
什
-0.06
ferm
-0.06
(proc
-0.06
manifests
-0.06
/animate
-0.06
Positive Logits
('__
0.06
ook
0.06
sacks
0.06
tales
0.06
:↵↵
0.06
Hook
0.06
Remove
0.06
skyline
0.06
_reports
0.06
Crow
0.06
Activations Density 0.028%
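
The density figure above can be read as the share of tokens on which the neuron is active. The sketch below illustrates that reading only; it assumes "active" means an activation strictly greater than zero, and both the function name `activation_density` and the sample data are hypothetical, not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, of which 3 activate the neuron.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the neuron is silent on the vast majority of tokens, which fits a feature that responds to a narrow class of inputs such as decimal-point numbers.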