
Explanations

the word "plan" and its derivatives

oai_token-act-pair · gemini-2.0-flash


Configuration

fnlp/Llama-Scope-R1-Distill/400M-Slimpajama-400M-OpenR1-Math-220k/L21R

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

Hzfinfdu/SlimPajama-3B and open-r1/OpenR1-Math-220k

Features

32,768

Data Type

float32

Hook Name

blocks.21.hook_resid_post

Architecture

jumprelu

Context Size

1,024

Dataset

cerebras/SlimPajama-627B
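
The configuration above lists the architecture as jumprelu. As a rough illustration only, a JumpReLU activation passes a pre-activation through unchanged when it exceeds a per-feature threshold and outputs zero otherwise. The function name and the threshold value below are hypothetical and are not taken from this SAE's weights.

```python
def jumprelu(pre_activation: float, threshold: float) -> float:
    """Return the input unchanged if it exceeds the threshold, else 0."""
    return pre_activation if pre_activation > threshold else 0.0


def relu(pre_activation: float) -> float:
    """Plain ReLU, shown for comparison."""
    return max(pre_activation, 0.0)


# Hypothetical threshold; the real per-feature thresholds are learned.
theta = 0.5
values = [-1.0, 0.2, 0.9]

print([relu(v) for v in values])             # [0.0, 0.2, 0.9]
print([jumprelu(v, theta) for v in values])  # [0.0, 0.0, 0.9]
```

The only difference from ReLU in this sketch is that small positive values below the threshold are zeroed.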


Negative Logits

gang

-0.08

.gdx

-0.08

gun

-0.07

ibaba

-0.07

sov

-0.07

że

-0.07

alars

-0.07

unner

-0.07

 lẽ

-0.07

dür

-0.07

Positive Logits

etary

0.11

isphere

0.10

egg

0.10

ter

0.10

(plan

0.09

 ning

0.09

er

0.09

-plan

0.08

-ahead

0.08

 plan

0.08

Activations Density 0.013%
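
To give the density figure a sense of scale, the arithmetic below converts it to an approximate token count. It assumes that the density is the fraction of tokens on which the feature is nonzero and that it was measured over the dashboard prompt set listed above (24,576 prompts of 128 tokens each). Neither assumption is stated on this page.

```python
# Figures copied from the page.
prompts = 24_576
tokens_per_prompt = 128
density = 0.013 / 100  # 0.013% expressed as a fraction

# Assumption: density = (tokens with nonzero activation) / (total tokens).
total_tokens = prompts * tokens_per_prompt
expected_active = total_tokens * density

print(total_tokens)            # 3145728
print(round(expected_active))  # 409
```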


No Known Activations