Neuronpedia

Explanations

No Explanations Found

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token  Logit
"概"  -0.07
" chin"  -0.06
"_ONCE"  -0.06
"的前提下"  -0.06
"件"  -0.06
"回合"  -0.06
"Cre"  -0.06
"场地"  -0.06
" Preis"  -0.06
"\tbest"  -0.06

Positive Logits

Token  Logit
"_tls"  0.07
"Roles"  0.07
" racket"  0.07
" teenage"  0.07
"죗"  0.07
"玫瑰"  0.07
" ;↵"  0.07
"贸易战"  0.07
" declaración"  0.07
"死亡"  0.06

Activations Density: 0.006%

No Known Activations
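
To put the reported density in perspective, the short sketch below converts it into an expected number of active tokens per context window. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation; the page does not define the term, so treat that reading as an assumption. The variable names are illustrative only.

```python
# Values copied from this page.
density_percent = 0.006   # Activations Density, in percent
context_size = 1024       # Context Size, in tokens

# Assumption: density is the fraction of tokens with a nonzero activation.
density = density_percent / 100.0

# Expected number of tokens in one full context on which the feature fires.
expected_per_context = density * context_size

# Average number of tokens between activations under the same assumption.
tokens_per_activation = 1.0 / density

print(f"expected active tokens per context: {expected_per_context:.4f}")
print(f"roughly one activation every {tokens_per_activation:,.0f} tokens")
```

Under that assumption the feature fires far less than once per context on average, which is consistent with the page listing no known activations.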

© Neuronpedia 2025