Explanations
articles
This neuron consistently activates on numeric tokens, especially floating-point numbers, rather than on ordinary words.
Negative Logits
where       -0.07
where       -0.07
crossings   -0.07
follows     -0.07
.A          -0.07
سبز         -0.06
frame       -0.06
included    -0.06
crossing    -0.06
labeled     -0.06
Positive Logits
lief        0.07
환          0.06
ListNode    0.06
arring      0.06
PerPixel    0.06
_pen        0.06
tsx         0.06
memiş       0.06
uninsured   0.06
漏          0.06
Activations Density 0.018%