Explanations

words or tokens related to programming, technical terms, or conversational roles within code or instruction-like contexts.

oai_token-act-pair · gemini-2.5-flash Triggered by @xiaoqingsun004

instructions about how to analyze, process, or structure responses to user queries.

oai_token-act-pair · claude-4-5-haiku Triggered by @xiaoqingsun004

chat-style conversation scaffolding, especially role markers, prompt/instruction meta text, and assistant reply boilerplate within multi-turn dialogues.

oai_token-act-pair · gpt-5 Triggered by @jdhshshs138

references to specific test strings or identifiers (particularly "davidjl") being analyzed or manipulated in conversational exchanges.

oai_token-act-pair · claude-4-5-sonnet Triggered by @jdhshshs138

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"iap"                  -0.08
"atient"               -0.07
"😳"                   -0.07
"eng"                  -0.07
"事を"                 -0.07
"正是因为"             -0.07
"Concat"               -0.07
"jt"                   -0.07
"ماذا"                 -0.07
".LayoutControlItem"   -0.07

Positive Logits

" giải"        0.08
"_of"          0.08
"_expr"        0.08
" продук"      0.07
"submission"   0.07
"_SPLIT"       0.07
"老婆"         0.07
".div"         0.07
"ся"           0.07
" пара"        0.07
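
The page does not say how the logit values above were computed. A common approach for dashboards of this kind is to project the feature's decoder direction through the model's unembedding matrix and rank the vocabulary by the resulting scores. The sketch below assumes that approach; the function name `logit_effects` and all numbers are hypothetical toy values, not data from this feature.

```python
def logit_effects(decoder_direction, unembed, vocab):
    """Score every vocabulary token by the dot product of the feature's
    decoder direction with that token's unembedding column.

    unembed is a list of rows, one per model dimension, each holding one
    entry per vocabulary token (shape: d_model x vocab_size).
    Returns (token, score) pairs sorted from most positive to most negative.
    """
    scores = []
    for j, token in enumerate(vocab):
        score = sum(d * row[j] for d, row in zip(decoder_direction, unembed))
        scores.append((token, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy values: 2 model dimensions, 3 vocabulary tokens.
direction = [1.0, -1.0]
unembed = [
    [0.5, 0.0, -0.25],
    [0.0, 0.5, 0.0],
]
vocab = ["_of", "iap", ".div"]

ranked = logit_effects(direction, unembed, vocab)
top_positive = ranked[0]   # token the feature pushes up the most
top_negative = ranked[-1]  # token the feature pushes down the most
```

In this toy setup the head of `ranked` plays the role of the positive logits list and the tail plays the role of the negative logits list.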

Activations Density 42.091%
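
Activation density is usually read as the fraction of token positions on which a feature has a non-zero activation. The sketch below computes it under that assumption; the helper `activation_density` and the sample values are hypothetical and do not come from this feature's data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over ten tokens (not real data).
sample = [0.0, 1.3, 0.0, 0.2, 0.0, 0.0, 2.1, 0.0, 0.7, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # 4 of 10 positions are active
```

Under this reading, a density of 42.091% would mean the feature fires on roughly two of every five tokens, which is dense for a sparse autoencoder feature.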

No Known Activations