OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
legal-style discussions of discrimination and protected grounds, especially definitions or analyses of unfair treatment based on identity characteristics
gpt-5
against a specific group based on protected grounds (such as
expressions that describe spatial containment or discreet placement, such as something being tucked or positioned within, behind, or between other objects.
discussions about systemic gender and representation bias—especially in STEM, technology, and research design—highlighting male-centered systems that disadvantage women and advocating for greater diversity and inclusion.
This neuron detects mentions of race and racial-group topics, especially content about racial identity, discrimination, representation, or related controversies.
gpt-5-mini
receives criticism, while changing a character from white to black
mentions of academic institutions and affiliations, especially “University of …” names, research labs, and spin‑off attributions after people or companies
gpt-5
University of Singapore (now National University of Singapore) and