INDEX
Explanations
The neuron selectively activates on numeric tokens in the text, especially floating-point numbers and signed decimals.
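As a rough illustration of the token class this explanation describes, the sketch below flags tokens that look like integers, floating-point numbers, or signed decimals. The pattern and the name `is_numeric_token` are hypothetical and are not taken from the dashboard. This is only an approximation of what "numeric token" might mean here, not the neuron's actual selectivity.

```python
import re

# Hypothetical pattern: optional sign, then digits with an optional
# fractional part (e.g. "42", "3.14", "-0.07", ".5").
NUMERIC_TOKEN = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)")


def is_numeric_token(token: str) -> bool:
    """Return True if the whole token (ignoring surrounding whitespace)
    looks like an integer, a float, or a signed decimal."""
    return NUMERIC_TOKEN.fullmatch(token.strip()) is not None


if __name__ == "__main__":
    for tok in ["-0.07", "3.14", " 42", "Court", "-"]:
        print(repr(tok), is_numeric_token(tok))
```

A bare sign such as `-` is deliberately rejected, since it contains no digits.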
Negative Logits
Sep       -0.08
Describe  -0.07
acesso    -0.07
PFN       -0.07
друга     -0.07
ียรต      -0.07
Court     -0.06
ARTH      -0.06
USC       -0.06
BUILD     -0.06
Positive Logits
-         0.09
)—        0.07
"-        0.07
않았다    0.07
-         0.07
'-        0.07
员        0.07
-п        0.07
indemn    0.06
‑         0.06
Activations Density 0.008%