Explanations

The neuron fires on mentions of “risk” (especially in quantified or “risk of” contexts).

oai_token-act-pair · o4-mini Triggered by @jyhe0408

the phrase "risk of" in contexts discussing potential dangers or hazards.

oai_token-act-pair · claude-4-5-sonnet Triggered by @jyhe0408

mentions of risk or danger, especially constructions discussing the “risk of” something and efforts to reduce or manage that risk.

oai_token-act-pair · gpt-5 Triggered by @jyhe0408

Configuration

google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

 impervious    1.00
কারের    0.98
 Shelter    0.98
 withstand    0.95
 şart    0.95
WCHAR    0.95
 negatively    0.94
 ochr    0.94
eas    0.93
 PROTECTION    0.92

Positive Logits

ל    1.24
фи    1.05
ש    1.05
风险    1.03
ח    1.01
Risk    1.00
री    0.99
 Risk    0.98
我    0.97
ги    0.97
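
Token lists like the ones above are commonly produced by taking the dot product of a feature's decoder direction with each row of the model's unembedding matrix, then ranking the tokens by that score. Whether this dashboard computes its values exactly this way is an assumption. The sketch below illustrates the ranking step only; the function name, the two-dimensional vectors, and the scores are hypothetical toy values, not data from this feature.

```python
from typing import List, Sequence, Tuple


def rank_logit_tokens(
    decoder_dir: Sequence[float],
    unembed_rows: Sequence[Sequence[float]],
    vocab: Sequence[str],
    k: int = 2,
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (top-k most positive, top-k most negative) tokens by dot product."""
    # Score each token by projecting the feature direction onto its unembedding row.
    scores = [sum(d * w for d, w in zip(decoder_dir, row)) for row in unembed_rows]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Hypothetical toy data (NOT taken from the real model).
toy_vocab = ["Risk", " Shelter", " withstand", "the"]
toy_unembed = [[1.0, 0.2], [-0.98, 0.1], [-0.95, 0.3], [0.0, 0.5]]
toy_direction = [1.0, 0.0]

positive, negative = rank_logit_tokens(toy_direction, toy_unembed, toy_vocab, k=1)
print(positive, negative)
```

In this toy setup the feature direction boosts "Risk" and suppresses " Shelter", mirroring the qualitative pattern in the lists above.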

Activations Density 0.100%
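
To put the density figure in context, the arithmetic can be sketched as follows. It assumes that the density is the fraction of tokens on which the feature is active and that it was measured over the prompt set listed under Configuration (392,802 prompts of 256 tokens each). Neither assumption is confirmed by the page, and the function name is illustrative.

```python
def expected_active_tokens(num_prompts: int, tokens_per_prompt: int, density: float) -> float:
    """Estimate how many tokens activate a feature, given its activation density."""
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens * density


# Figures from the dashboard; 0.100% is written as the fraction 0.001.
total_tokens = 392_802 * 256
active = expected_active_tokens(392_802, 256, 0.001)

print(total_tokens)   # 100557312
print(round(active))  # 100557
```

Under these assumptions the feature would be active on roughly 100,000 of about 100 million tokens.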

No Known Activations