Explanations
punctuation
The neuron selectively activates on subword tokens that include digits (or digit‐letter mixes), such as numbers, percentages, and chemical formulas.
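As a rough illustration of the token pattern this explanation describes, the sketch below sorts a token into "digits", "digit-letter mix", or "no digits". The function name `describe_token` and the sample tokens are hypothetical and are not taken from the dashboard. The check is a simple character-level heuristic, not the procedure used to generate the explanation.

```python
def describe_token(token: str) -> str:
    """Classify a token by whether it contains digits and/or letters.

    Hypothetical helper for illustration only; it looks at characters,
    not at the neuron's actual activations.
    """
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    if has_digit and has_alpha:
        return "digit-letter mix"
    if has_digit:
        return "digits"
    return "no digits"


# Assumed sample tokens: a number, a percentage, a chemical formula, a plain subword.
samples = ["2024", "50%", "H2O", "soc"]
for tok in samples:
    print(f"{tok!r}: {describe_token(tok)}")
```

A heuristic like this only describes surface form. Whether the neuron actually fires on a given token has to be read from its recorded activations.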
Negative Logits
14            -0.09
22            -0.08
0             -0.08
soc           -0.07
574           -0.07
bus           -0.07
13            -0.07
ize           -0.07
(blank)       -0.07
rig           -0.07
Positive Logits
overhe        0.07
greatest      0.07
.nextDouble   0.07
++↵           0.07
双线          0.07
عات           0.07
子の          0.07
같은          0.07
trăm          0.07
дром          0.07
Activations Density 0.233%