Explanations
This neuron activates on numeric tokens, particularly decimal numbers (floating-point measurements).
Negative Logits
toolbox         -0.07
esta            -0.06
800             -0.06
@if             -0.06
847             -0.06
Roger           -0.06
Distribution    -0.06
grades          -0.06
777             -0.06
wird            -0.06
Positive Logits
主義            0.07
@testable       0.07
PRIMARY         0.07
.navCtrl        0.06
jspb            0.06
cre             0.06
(Unknown        0.06
初始化          0.06
{↵              0.06
volupt          0.06
Activation Density: 0.145%
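
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that density means the fraction of token positions on which the neuron's activation exceeds a threshold (zero here). That definition, the function name `activation_density`, and the sample activations are all assumptions made for illustration; none of them come from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: density = (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2,000 token positions, 3 of which activate the neuron.
sample = [0.0] * 1997 + [0.8, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage with three decimals
```

Under this assumed definition, a density of 0.145% would mean the neuron is active on roughly 1 token in 690.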