Neuronpedia

INDEX

Explanations

Code, file paths

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token               Logit
"rott"              -0.09
" interpersonal"    -0.09
" city"             -0.09
" SWOT"             -0.08
".Person"           -0.08
" offent"           -0.08
"意"                -0.08
" शहर"              -0.08
" город"            -0.08
" organizational"   -0.08

Positive Logits

Token               Logit
"Native"            0.11
" Native"           0.09
" libc"             0.09
"ffi"               0.09
"native"            0.09
" raspberry"        0.09
" veloc"            0.09
".tensor"           0.08
"Whe"               0.08
"Tensor"            0.08

Activations Density 0.012%

No Known Activations

© Neuronpedia 2025
