Explanations
This neuron activates on numeric tokens, especially floating-point literals such as decimal numbers.
Negative Logits
Token   Logit
Hệ      -0.07
628     -0.07
يلاد    -0.07
суду    -0.07
Saul    -0.07
GINE    -0.07
Jung    -0.07
.ut     -0.06
Wash    -0.06
ranger  -0.06
Positive Logits
Token       Logit
ruise       0.06
_traits     0.06
#"          0.06
issen       0.06
deepen      0.06
.Interface  0.06
_STATE      0.06
sort        0.05
lv          0.05
vanized     0.05
Activations Density: 0.024%