Explanations
This neuron specifically fires on numeric tokens (especially decimal numbers) in the text.
Negative Logits
males          -0.06
marketed       -0.06
RN             -0.06
suicide        -0.06
Fax            -0.06
_unsigned      -0.06
boxed          -0.06
ा�             -0.06
�              -0.06
Summary        -0.06
Positive Logits
Kee            0.08
VStack         0.07
discriminatory 0.07
نگی            0.07
(notification  0.07
Cly            0.06
Что            0.06
_ground        0.06
نج             0.06
Xi             0.06
Activations Density 0.002%