Explanations
This neuron responds to numeric tokens (numbers such as page/volume figures and years).
Negative Logits
.ui          -0.07
rozum        -0.07
kings        -0.07
_I           -0.07
.useState    -0.07
.c           -0.07
guidelines   -0.07
ruling       -0.07
trivia       -0.07
implied      -0.07
Positive Logits
30     0.10
170    0.10
20     0.10
40     0.10
110    0.09
80     0.09
50     0.09
590    0.09
70     0.09
580    0.08
Activations Density: 0.298%
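
The density figure can be read as the share of token positions on which the neuron fires. The sketch below illustrates that reading only. It assumes density means "fraction of positions with activation above a threshold"; the function name `activation_density`, the zero threshold, and the toy data are all hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    neuron is active at all. The threshold of 0.0 is illustrative.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 3 active positions out of 1000.
toy = [0.0] * 997 + [0.4, 1.2, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")  # prints "0.300%"
```

Under this reading, a density of 0.298% would correspond to the neuron being active on roughly 3 of every 1000 token positions.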