Explanations

Turning lights on/off

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" daha"  -0.08
"绿豆"  -0.07
"并不是很"  -0.07
"veyor"  -0.07
"ennie"  -0.07
"新征程"  -0.07
" Politico"  -0.07
" невозможно"  -0.07
" sık"  -0.07
" اليمن"  -0.07

Positive Logits

" safety"  0.07
" planted"  0.07
" Abraham"  0.07
"/modal"  0.06
".vars"  0.06
"AR"  0.06
"ação"  0.06
" Protestant"  0.06
" accom"  0.06
"_adapter"  0.06
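
For readers unfamiliar with logit tables like the two above: one common way to produce them is to project a feature's decoder direction through the model's unembedding matrix and list the tokens with the most negative and most positive resulting values. The page does not state how its numbers were computed, so the sketch below is only an illustration under that assumption. The function name `top_logit_effects`, the toy matrices, and the toy vocabulary are all hypothetical.

```python
import numpy as np

def top_logit_effects(decoder_direction, unembed, vocab, k=2):
    """Project a feature's decoder direction through an unembedding matrix
    and return the k most negative and k most positive (token, value) pairs."""
    effects = decoder_direction @ unembed   # shape: (vocab_size,)
    order = np.argsort(effects)             # token indices, ascending by effect
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data (invented): 2-dimensional residual stream, 4-token vocabulary.
vocab = ["a", "b", "c", "d"]
unembed = np.array([
    [0.3, -0.2, 0.1, -0.4],
    [0.0,  0.5, 0.2,  0.1],
])
decoder_direction = np.array([1.0, 0.0])

negative, positive = top_logit_effects(decoder_direction, unembed, vocab)
print("negative:", negative)
print("positive:", positive)
```

In this toy case the feature pushes hardest against token "d" and hardest toward token "a". On a real model the same projection would use the SAE's decoder row for the feature and the model's own unembedding weights.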

Activations Density: 0.034%
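
An activation density is usually read as the fraction of tokens on which the feature fires at all. The page does not define the term or give the token count behind it, so the sketch below assumes that reading, and its counts (34 firing tokens out of 100,000) are invented purely so the ratio is easy to follow. The helper name `activation_density` is hypothetical.

```python
import numpy as np

def activation_density(activations):
    """Fraction of tokens where the feature's activation is strictly positive.
    Assumes a ReLU-style SAE, where an inactive feature outputs exactly zero."""
    activations = np.asarray(activations)
    return float(np.count_nonzero(activations > 0) / activations.size)

# Toy data (invented): the feature fires on 34 of 100,000 tokens.
acts = np.zeros(100_000)
acts[:34] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")
```

A value this small means the feature is sparse: it is silent on the overwhelming majority of tokens.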

No Known Activations