Neuronpedia

INDEX

Explanations

vertex

np_max-act · gemini-2.0-flash

Configuration: andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard): Various

Negative Logits

Token               Logit
"öst"               -0.09
" أه"               -0.09
" ఉద్యోగ"           -0.08
"ulich"             -0.08
".expression"       -0.08
" Schau"            -0.08
"ahl"               -0.08
" شر"               -0.08
" Qualifications"   -0.08
" geeignet"         -0.07

Positive Logits

Token               Logit
" gemeinsame"       0.11
"common"            0.11
" common"           0.10
"\tcommon"          0.10
" shared"           0.10
"Shared"            0.10
"Common"            0.09
"(common"           0.09
"shared"            0.09
" gemeinsamen"      0.09

Activations Density 0.024%

No Known Activations

© Neuronpedia 2025
