Explanations

internships and education

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 induction    -0.08
 disorder    -0.07
 entonces    -0.07
一如既往    -0.07
Unhandled    -0.07
دا    -0.06
aligned    -0.06
.assignment    -0.06
想到了    -0.06
 mejor    -0.06

Positive Logits

 Irving    0.07
 ViewModel    0.07
 STEM    0.07
泡泡    0.07
商    0.07
ترت    0.07
irim    0.07
Design    0.07
뜁    0.06
 вели    0.06

Activations Density 0.089%
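
To put the density figure in perspective, the short sketch below converts it into an expected number of activating tokens per context. It assumes that "Activations Density" means the fraction of token positions on which the feature has a nonzero activation, and it reuses the 1,024-token context size listed in the configuration. The function and variable names are illustrative only and do not come from the dashboard.

```python
# Assumption: "Activations Density" is the fraction of token positions
# on which this feature has a nonzero activation.
DENSITY_PERCENT = 0.089   # value shown on the dashboard
CONTEXT_SIZE = 1024       # tokens per context, from the configuration


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of positions where the feature fires in n_tokens tokens."""
    return (density_percent / 100.0) * n_tokens


def mean_tokens_between_firings(density_percent: float) -> float:
    """Average number of tokens per activation, i.e. the reciprocal of the density."""
    return 100.0 / density_percent


per_context = expected_active_tokens(DENSITY_PERCENT, CONTEXT_SIZE)
gap = mean_tokens_between_firings(DENSITY_PERCENT)

print(f"expected active tokens per context: {per_context:.3f}")
print(f"mean tokens per activation: {gap:.1f}")
```

Under that reading, the feature would be expected to fire on slightly fewer than one token per 1,024-token context, which is consistent with a sparse feature.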

internships and education

No Known Activations
