Explanations
The neuron fires on numeric tokens, especially signed or decimal numbers; in other words, it detects number-like entries in the text.
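As a rough illustration of what "number-like" could mean here, the sketch below flags tokens that look like signed or decimal numbers. The pattern `NUMBER_LIKE` and the helper `is_number_like` are hypothetical names chosen for this example; they are an assumption about the kind of token described above, not the neuron's actual behavior or the method used to produce the explanation.

```python
import re

# Hypothetical pattern (an assumption, not derived from the neuron itself):
# optional sign, then digits with an optional decimal part, or a leading-dot decimal.
NUMBER_LIKE = re.compile(r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)")


def is_number_like(token: str) -> bool:
    """Return True if the whole token, ignoring surrounding whitespace, looks like a number."""
    return NUMBER_LIKE.fullmatch(token.strip()) is not None


# Classify a few sample tokens: signed, decimal, and plain-word cases.
examples = ["-0.07", "42", "+3.5", ".5", "taco", "height"]
flags = {tok: is_number_like(tok) for tok in examples}
```

A filter like this could be used to check how many of the neuron's top-activating tokens are numeric, which is one simple way to sanity-check the explanation.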
Negative Logits
Token         Logit
groß          -0.07
課            -0.07
ίο            -0.06
işte          -0.06
різних        -0.06
γχ            -0.06
nhờ           -0.06
height        -0.06
breakfast     -0.06
testimon      -0.06
Positive Logits
Token         Logit
trans         0.07
buquerque     0.06
Taco          0.06
delic         0.06
reins         0.06
Crab          0.06
NSMutable     0.06
slur          0.06
invitations   0.06
taco          0.06
Activations Density: 0.004%