INDEX
Explanations
Numbers following words like 'list of' or 'over'.
This neuron detects numeric tokens: numbers and numerical quantities (statistics, measurements, years, etc.) in the text.
Negative Logits
Token            Value
愎               0.29
躹               0.26
zdjęcie          0.25
ttino            0.25
þat              0.25
dítě             0.24
څر               0.24
애               0.24
단순히           0.24
instantiation    0.24
Positive Logits
Token    Value
5        0.51
3        0.49
7        0.49
4        0.49
8        0.47
6        0.47
six      0.43
9        0.42
2        0.41
1        0.40
Activations Density: 0.268%
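
The page does not define how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the neuron's activation exceeds zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data (made up): the neuron fires on 3 of 8 tokens.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
print(f"{activation_density(sample):.3f}%")  # prints 37.500%
```

Under this reading, a density of 0.268% would mean the neuron is active on roughly 27 out of every 10,000 tokens.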