Explanations
The neuron activates specifically on decimal number tokens (floating‐point values) in the text.
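To make "decimal number tokens" concrete, here is a minimal sketch that picks out floating-point literals from raw text. The pattern and the helper name `find_decimals` are illustrative assumptions and do not come from the dashboard. The model's tokenizer may also split a decimal across several tokens, so this only approximates the strings the explanation refers to.

```python
import re

# Assumption: a "decimal number" is digits, a dot, then more digits (e.g. 3.14, 0.022).
# The lookarounds skip identifiers and dotted version strings such as v1.2.3.
DECIMAL_RE = re.compile(r"(?<![\w.])\d+\.\d+(?!\w|\.\d)")


def find_decimals(text: str) -> list[str]:
    """Return the decimal-number substrings found in text, in order."""
    return DECIMAL_RE.findall(text)


if __name__ == "__main__":
    sample = "loss fell from 0.75 to 0.5. Ran 3 epochs on v1.2.3"
    print(find_decimals(sample))
```

Plain integers such as `3` are left out on purpose, since the explanation singles out floating-point values rather than numbers in general.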
Negative Logits
lık         -0.07
smart       -0.07
seeker      -0.06
Imperial    -0.06
tình        -0.06
era         -0.06
inding      -0.06
üt          -0.06
废          -0.06
Moody       -0.06
Positive Logits
_zero       0.07
undone      0.06
(after      0.06
*Math       0.06
getChild    0.06
klient      0.06
Друг        0.06
druh        0.06
lbrace      0.06
embedded    0.06
Activations Density 0.022%
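As a rough sketch of how a density figure like this could be computed, the snippet below assumes that density means the fraction of sampled tokens on which the neuron's activation is above zero. The dashboard does not define the metric, so that definition, the function name `activation_density`, and the `threshold` parameter are all assumptions.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical sample: 22 active tokens out of 100,000.
    acts = [1.0] * 22 + [0.0] * 99_978
    print(f"{activation_density(acts) * 100:.3f}%")
```

Under that assumption, a density of 0.022% corresponds to the neuron firing on roughly 22 of every 100,000 tokens.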