About Neuronpedia
Goals
Neuronpedia's first goal is to increase AI safety and understanding. It is a collaborative effort to explain and understand modern AI models in order to make them safer and more predictable.
Neuronpedia's second goal is to increase public participation, education, and awareness of AI safety. By building a game that anyone can play, Neuronpedia makes AI and AI safety approachable without any prior technical knowledge.
What Neuronpedia Is
Neuronpedia has two parts: a game and a reference.
First, Neuronpedia is a game that lets anyone contribute their cleverness to help understand AI models, with no technical knowledge required. Players earn points as they play and compete for the top spots in a global ranking. You can think of it as "GeoGuessr for AI neurons".
Second, Neuronpedia is a "Wikipedia for AI neurons". The data generated by the game (explanations and votes) is stored as a reference for each AI neuron, with the best explanations for each neuron surfacing to the top. By analyzing the top explanations and activations, researchers can better understand AI models.
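As a rough illustration of how vote-ranked explanations might be organized, here is a minimal sketch in Python. It is not Neuronpedia's actual implementation: the `Explanation` record, its fields, the neuron identifiers, and the `top_explanations` helper are all hypothetical, and ranking purely by vote count is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Explanation:
    """One player-submitted explanation for a neuron (hypothetical schema)."""
    neuron_id: str
    text: str
    votes: int = 0


def top_explanations(explanations, neuron_id, limit=3):
    """Return the highest-voted explanations for one neuron, best first."""
    for_neuron = [e for e in explanations if e.neuron_id == neuron_id]
    # Sort by vote count, descending. sorted() is stable, so tied
    # explanations keep their original (submission) order.
    return sorted(for_neuron, key=lambda e: e.votes, reverse=True)[:limit]


# Made-up sample data, for illustration only.
explanations = [
    Explanation("layer0-neuron42", "fires on words about weather", votes=5),
    Explanation("layer0-neuron42", "fires on the word 'rain'", votes=12),
    Explanation("layer0-neuron7", "fires on opening parentheses", votes=3),
]

best = top_explanations(explanations, "layer0-neuron42")
for e in best:
    print(e.votes, e.text)
```

A real ranking would likely need more than a raw vote count (for example, some way to handle new explanations that have not yet been voted on), but the sketch shows the basic idea of a per-neuron reference ordered by community votes.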
Committed to Open Data and Transparency
All data sourced by Neuronpedia is freely available for research purposes. You can browse all data here, or if you want a specific data export or format, just ask.
Acknowledgements
Neuronpedia uses and is grateful for these data and tools:
- Sparse Coding + Directions by Hoagy Cunningham, Logan Riggs, and Aidan Ewart
- Automated Interpretability by OpenAI
- Neuroscope by Neel Nanda
- TransformerLens by Neel Nanda
Neuronpedia is built and operated in San Francisco by Johnny Lin. William Saunders provided informal advice and guidance. Neuronpedia is supported by a short-term grant from a donor, and a regrant from Austin Chen (through Manifund). Let us know if you're interested in donating to Neuronpedia.