INDEX

Explanations

references to actions and their impacts

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token            Logit
"стро"           -0.08
"stro"           -0.07
" становить"     -0.07
"ideographic"    -0.07
"ToBounds"       -0.07
".Surface"       -0.07
"siz"            -0.07
"istrovství"     -0.07
" thọ"           -0.07
"statuses"       -0.07

Positive Logits

Token            Logit
"/actions"       0.10
" actions"       0.10
"actions"        0.08
"inic"           0.08
"acts"           0.08
"-actions"       0.08
" towards"       0.07
" action"        0.07
" acts"          0.07
"-action"        0.07

Activations Density 0.018%
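
To give a rough sense of scale for the density figure, the sketch below estimates how many tokens the feature would fire on. It assumes, which the page does not state, that the 0.018% density was measured over the same dashboard prompt set (24,576 prompts of 128 tokens each). The variable names are illustrative only.

```python
# Rough estimate of how many tokens activate this feature.
# ASSUMPTION: the density was computed over the dashboard prompt set
# listed under Configuration; the page itself does not confirm this.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.018  # "Activations Density" as reported

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimated_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active:.0f}")
```

Under that assumption the feature fires on only a few hundred of roughly three million tokens, which fits a sparse feature tied to a narrow concept such as "actions".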