Neuronpedia

Explanations

Russian language texts

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token           Logit
"mir"           -0.09
"boll"          -0.08
" Creo"         -0.08
"atores"        -0.08
" disreg"       -0.08
" diferents"    -0.08
" destruir"     -0.08
" marcador"     -0.08
"Nivel"         -0.08
"ベル"          -0.08

Positive Logits

Token           Logit
" грамот"       0.11
" золот"        0.10
" прекрас"      0.10
" грам"         0.10
" муд"          0.10
"ю"             0.10
" elegant"      0.10
" любви"        0.10
" рассказ"      0.09
" محبت"         0.09

Activations Density: 0.058%

No Known Activations

© Neuronpedia 2025
