Explanations

parts of words, often followed by punctuation

np_acts-logits-general · gemini-2.5-flash-lite

This neuron responds strongly to longer, relatively uncommon multisyllabic words.

oai_token-act-pair · o4-mini Triggered by @jyhe0408

words or phrases indicating negative consequences, problems, or difficulties.

oai_token-act-pair · claude-4-5-sonnet Triggered by @jyhe0408

discourse connectives that signal contrast, causation, or shifts/transitions in the flow of events or arguments.

oai_token-act-pair · gpt-5 Triggered by @jyhe0408

Configuration

google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

 printr  0.76
 定义  0.76
េ  0.76
 публі  0.72
 Sebuah  0.69
 депута  0.69
 carbonates  0.68
 તેને  0.67
 incomes  0.67
 створи  0.67

Positive Logits

(token not rendered)  0.82
 somente  0.67
(token not rendered)  0.65
 stets  0.61
あくまで  0.61
./  0.61
अधिकांश  0.61
 सदैव  0.60
!!  0.60
됩니다  0.59

Activations Density 0.085%

No Known Activations