INDEX

Explanations

References to AI assistants and large language models, especially self-referential descriptions of the model, tools, and benchmarks (often with dates or platform names).

The neuron primarily detects numeric tokens (digits, numerals and year-like numbers) in the text.

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"мето"  0.52
"сле"  0.48
" needlessly"  0.47
"counted"  0.46
" традиции"  0.45
"народ"  0.45
"ENSOR"  0.45
"कांची"  0.45
"из"  0.44
"ದ್ದರಿಂದ"  0.43

Positive Logits

"AI"  0.96
" ChatGPT"  0.94
" OpenAI"  0.92
" chatbot"  0.91
"GPT"  0.84
" conversational"  0.80
" chatbots"  0.76
" openai"  0.76
"openai"  0.73
"chatbot"  0.71

Activations Density: 1.397%
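
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It rests on two assumptions that the page itself does not state: that the density is the fraction of tokens on which the feature is active, and that it was measured over the same 238,145 prompts of 512 tokens each listed under Configuration. The function names are illustrative only.

```python
# Figures copied from the dashboard; everything else here is an assumption.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 1.397


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens, assuming every prompt is exactly tokens_per_prompt long."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated active tokens, assuming density = active tokens / total tokens."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,.0f}")
```

Under those assumptions the feature would be active on roughly 1.7 million of about 122 million tokens. If the density was computed over a different sample, the estimate does not apply.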