Explanations

works well

np_acts-logits-general · gemini-2.5-flash-lite

The neuron activates specifically on the adverb “well” when it’s used to describe how something works, fits, or contrasts.

oai_token-act-pair · o4-mini Triggered by @jyhe0408

the word "well" when it appears in phrases describing how things function, fit, or work together.

oai_token-act-pair · claude-4-5-sonnet Triggered by @jyhe0408

phrases expressing positive evaluations of performance, fit/compatibility, or attractive appearance.

oai_token-act-pair · gpt-5 Triggered by @jyhe0408

Configuration

google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

≣

0.79

ای

0.79

ي

0.78

ди

0.78

МА

0.77

LEVEL

0.77

انا

0.77

רא

0.75

رس

0.74

Positive Logits

currentColor

0.83

̣ng

0.76

্যায়

0.75

 graag

0.74

groomed

0.74

 Absatz

0.70

esteem

0.69

onds

0.68

unteer

0.68

 gosto

0.68

Activations Density 0.020%
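
As a rough sanity check on scale, the figures above can be combined to estimate how many tokens this feature fires on. This is a minimal sketch, not the dashboard's own computation: it assumes the activation density is the fraction of tokens with a nonzero activation and that it was measured over the same 392,802 prompts of 256 tokens each. The function name `estimate_active_tokens` is hypothetical.

```python
# Figures copied from the dashboard text above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.020 / 100  # 0.020% expressed as a fraction


def estimate_active_tokens(num_prompts: int, tokens_per_prompt: int, density: float) -> tuple[int, int]:
    """Return (total tokens, estimated activating tokens).

    Assumption: `density` is the per-token fraction of nonzero activations
    over this exact prompt set. The dashboard does not state this explicitly.
    """
    total = num_prompts * tokens_per_prompt
    return total, round(total * density)


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {active_tokens:,}")
```

Under these assumptions the feature fires on roughly one token in five thousand, which fits a feature tied to a single adverb in a narrow context.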
