INDEX
Explanations
The neuron activates on numeric tokens with decimal points (floating-point numbers).
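One rough way to sanity-check this explanation is to test whether a neuron's top-activating tokens look like decimal numbers. The sketch below is illustrative only: the pattern `FLOAT_TOKEN`, the helper `is_float_token`, and the sample tokens are hypothetical and do not come from the dashboard.

```python
import re

# Hypothetical pattern for "numeric token with a decimal point".
# Assumption: optional surrounding whitespace and sign, and at least one digit
# on one side of the decimal point.
FLOAT_TOKEN = re.compile(r"^\s*[+-]?(\d+\.\d*|\.\d+)\s*$")


def is_float_token(token: str) -> bool:
    """Return True if the token looks like a floating-point literal."""
    return FLOAT_TOKEN.match(token) is not None


# Made-up sample tokens, not taken from the neuron's activation data.
samples = ["3.14", " 0.07", "-0.06", "42", "bulk", ".5", "."]
float_like = [t for t in samples if is_float_token(t)]
```

If most top-activating tokens pass a check like this, the explanation is at least consistent with the data; it does not show that the neuron fires on nothing else.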
Negative Logits
구글상위    -0.07
мик    -0.07
版    -0.07
생활    -0.07
bulk    -0.06
thích    -0.06
Positive    -0.06
:border    -0.06
구글상위    -0.06
pursuits    -0.06
Positive Logits
ด    0.07
\/    0.07
MSI    0.06
buddy    0.06
vznik    0.06
Analy    0.06
".");↵    0.06
Ln    0.06
Elizabeth    0.06
equ    0.06
Activations Density 0.007%
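
As a minimal sketch of how a density figure like this could be computed, assuming density means the fraction of sampled tokens on which the neuron's activation exceeds zero (the dashboard does not state its exact definition). The function name `activation_density` and the synthetic data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above `threshold`.

    Assumption: "density" counts any activation above zero as active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 7 active tokens out of 100,000 sampled tokens.
acts = [0.0] * 99_993 + [1.2] * 7
density = activation_density(acts)
print(f"{density:.3%}")
```

A density this low would mean the neuron is silent on nearly all tokens, which fits a narrowly selective feature.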