INDEX

Explanations

events that evoke strong reactions or significant actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

tics

-0.07

 će

-0.07

istrovství

-0.07

IIIK

-0.07

だって

-0.07

 gonna

-0.07

_cmos

-0.07

 olmadan

-0.07

asaki

-0.07

atím

-0.07

Positive Logits

 earlier

0.09

 became

0.08

 gave

0.08

 took

0.07

 elsewhere

0.07

 failed

0.07

saw

0.06

Earlier

0.06

 leave

0.06

 began

0.06

Activations Density 0.055%
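
To put the density figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes the density is the fraction of tokens in the dashboard prompt set (24,576 prompts of 128 tokens each) on which the feature has a nonzero activation. That definition is an assumption and is not stated on this page, so the result is only an order-of-magnitude estimate.

```python
# Figures copied from the dashboard fields above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.055

# Total number of tokens covered by the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
expected_active_tokens = total_tokens * ACTIVATION_DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {expected_active_tokens:,.0f}")
```

Under that assumption, the feature fires on roughly one or two thousand of the approximately three million tokens, which fits a sparse feature.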