INDEX
Explanations
This neuron activates on numeric tokens—especially decimal numbers—within the text.
Negative Logits
]>↵         -0.07
}}>         -0.07
[] ↵        -0.06
ze          -0.06
'y          -0.06
]-->↵       -0.06
++) ↵       -0.06
.ITEM       -0.06
(Gravity    -0.06
Benefit     -0.06
Positive Logits
absorbed       0.07
instability    0.07
Winnipeg       0.06
تح             0.06
幻             0.06
Surface        0.06
τι             0.06
-n             0.06
Каб            0.06
trand          0.06
Activations Density: 0.001%