Explanations

say "the ending -ic, -otic, -ational, or -ations"

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"ве": -0.08
"UNC": -0.07
"[length": -0.07
"轮": -0.07
".What": -0.07
"$header": -0.07
"())/": -0.07
"LOOD": -0.07

Positive Logits

" jurors": 0.07
"amb": 0.07
" pozostał": 0.07
"BCM": 0.07
" officer": 0.07
"acs": 0.07
"pcl": 0.06
"专职": 0.06
"קבל": 0.06
"ירוש": 0.06

Activations Density 0.339%

No Known Activations