
Explanations

legal and court related

np_max-act-logits · gemini-2.0-flash Triggered by @majianfei030706


Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted


Negative Logits

" tackling"    -0.07
"賦"           -0.06
"Elements"     -0.06
"Comput"       -0.06
"capitalize"   -0.06
"Published"    -0.06
"Whether"      -0.06
"Leon"         -0.06
"发布时间"     -0.06
"美国总统"     -0.06

Positive Logits

"=result"        0.07
" square"        0.07
" FORMAT"        0.07
" userManager"   0.07
"普通的"         0.07
"upakan"         0.07
" EQUI"          0.06
"_observer"      0.06
" Median"        0.06
" Mode"          0.06

Activations Density 0.015%
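
As a rough reading of the density figure, the sketch below converts 0.015% into an expected number of activating tokens per context. It assumes that "Activations Density" means the fraction of tokens on which the feature fires, which this page does not state explicitly, and it uses the Context Size of 1,024 listed under Configuration. The function name is hypothetical and chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, context_size: int) -> float:
    """Expected number of tokens per context on which the feature is active.

    Assumption: density is the fraction of tokens with a nonzero activation.
    """
    return density_percent / 100.0 * context_size


density_percent = 0.015  # "Activations Density" value from this page
context_size = 1024      # "Context Size" value from this page

per_context = expected_active_tokens(density_percent, context_size)
tokens_per_activation = 100.0 / density_percent

print(f"{per_context:.4f} active tokens per context")
print(f"about 1 activation per {tokens_per_activation:.0f} tokens")
```

Under that assumption the feature would fire on roughly 0.15 tokens per 1,024-token context, or about once every 6,667 tokens, which would fit a sparse, topic-specific feature.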


No Known Activations
