OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
Snippets of HTML/JavaScript used in cross-site scripting or other client-side injection attacks (e.g., <script>, onerror/onclick attributes, src/import URLs, alert/document.cookie).
Tokens that mark chat structure and role/metadata (system/user/assistant markers, start/end boundaries, and other formatting/quote/punctuation markers).
Tokens that begin an assistant reply or dialogue turn, especially opening/confirmation phrases like "Of", "Of course," and the leading quote marks that start a response.