INDEX
Explanations
comparisons
This neuron activates on numeric tokens, especially decimal numbers and ratings, that express measurements or metrics in the text.
Negative Logits
uch  -0.06
kou  -0.06
giảng  -0.06
onz  -0.06
<div  -0.06
.POS  -0.06
exped  -0.06
)});↵  -0.06
Christopher  -0.06
}");↵  -0.06
Positive Logits
ตรว  0.07
μές  0.06
_Length  0.06
παν  0.06
Crowley  0.06
超  0.06
에는  0.06
iley  0.06
�  0.06
outright  0.06
Activations Density 0.055%
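
The density figure above can be read as the share of sampled tokens on which the neuron fires. The page does not state how it computes this, so the following is only a minimal sketch under one assumption: a token counts as active when its activation is strictly above a threshold (zero here). The function name `activation_density` and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of active tokens; the source
    dashboard does not define it explicitly.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 20,000 tokens, 11 of which have a nonzero activation.
sample = [0.0] * 19989 + [0.8] * 11
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density this low means the neuron is sparse: it stays silent on almost all tokens and fires on only a handful per ten thousand.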