Neuronpedia

Explanations

prompt (np_max-act · gemini-2.0-flash)

Configuration: andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard): Various

Negative Logits

" mind"         -0.09
" founded"      -0.09
" coined"       -0.08
" daim"         -0.08
" smooth"       -0.08
" Founded"      -0.07
" blog"         -0.07
" mixed"        -0.07
" fans"         -0.07
" according"    -0.07

Positive Logits

"_TEMPLATE"             0.09
" Vorlage"              0.09
" Rece"                 0.09
" Rochelle"             0.09
"\ttemplate"            0.08
"お願い"                 0.08
"模板"                   0.08
"<|start|>"             0.08
" beoord"               0.08
"\t\t\t\n\t\t\t\n"      0.08

Activations Density 0.007%

No Known Activations

© Neuronpedia 2025
