Explanations
The neuron primarily activates on numeric literals in the text, especially decimal numbers.
Negative Logits
Token      Logit
"          -0.06
nearing    -0.06
wur        -0.06
fj         -0.06
Raider     -0.06
.','       -0.06
ющ         -0.06
tragic     -0.06
verige     -0.06
obj        -0.06
Positive Logits
Token      Logit
्ज         0.07
_iterator  0.06
Müş        0.06
allowing   0.06
aver       0.06
(let       0.06
老师       0.06
vět        0.06
hopeless   0.06
onenumber  0.06
Activations Density 0.426%
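For readers unfamiliar with the density figure, the sketch below shows one way such a number could be computed. It assumes that density means the fraction of sampled token positions on which the neuron's activation exceeds zero. The page does not confirm that definition, and the function name, threshold, and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the neuron
    fires at all. The dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 4 of which activate the neuron.
sample = [0.0] * 996 + [1.2, 0.8, 2.5, 0.3]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.426% would mean the neuron fires on roughly 4 of every 1000 sampled tokens, which fits a feature tied to a narrow token class such as numeric literals.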