INDEX
Explanations
The neuron selectively activates on floating-point numeric tokens (i.e., decimal numbers).
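As a rough illustration of what "floating-point numeric tokens" could mean in practice, the sketch below flags strings that look like decimal numbers. The regular expression and the helper name `is_decimal_token` are assumptions made for illustration, not part of the dashboard. A real tokenizer may also split a number such as "3.14" into several tokens, which this sketch ignores.

```python
import re

# Assumption: a "decimal number" is an optional sign, one or more digits,
# a decimal point, and one or more digits. Integers do not count.
DECIMAL_RE = re.compile(r"[+-]?\d+\.\d+")

def is_decimal_token(token: str) -> bool:
    """Return True if the whole token (ignoring surrounding whitespace) is a decimal number."""
    return DECIMAL_RE.fullmatch(token.strip()) is not None

# Hypothetical example strings, not taken from the neuron's actual activations.
examples = ["3.14", "0.001", "42", "banks", "-0.06"]
matches = [t for t in examples if is_decimal_token(t)]
print(matches)
```

A filter like this could be used to check what fraction of the top-activating tokens actually fit the explanation.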
Negative Logits
ющ       -0.06
-money   -0.06
dime     -0.06
ancor    -0.06
ُو       -0.06
neau     -0.06
Rad      -0.06
outr     -0.06
wohl     -0.06
งของ     -0.06
Positive Logits
hend      0.07
gradient  0.07
unto      0.06
матері    0.06
MP        0.06
banks     0.06
ctr       0.06
low       0.06
_ctx      0.06
trl       0.06
Activations Density 0.001%
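To put the density figure in perspective, here is a small worked conversion. It assumes that "activations density" means the fraction of sampled tokens on which the neuron has a nonzero activation. The 1,000,000-token corpus size is a hypothetical value chosen only for the arithmetic.

```python
# Density as reported above, in percent.
density_percent = 0.001

# Convert percent to a plain fraction: 0.001% is 1e-5.
density_fraction = density_percent / 100

# Roughly how many tokens pass, on average, per activating token.
tokens_per_activation = round(1 / density_fraction)

# Expected number of activating tokens in a hypothetical 1,000,000-token sample.
sample_size = 1_000_000
expected_activations = round(sample_size * density_fraction)

print(tokens_per_activation, expected_activations)
```

Under that assumption, the neuron fires on about one token in 100,000, which fits a highly selective feature.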