Explanations

phrases indicating capability or potential action

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Logit   Token (quoted to show leading spaces)
-0.06   "ILINE"
-0.06   "alam"
-0.05   " patron"
-0.05   "chein"
-0.05   " Liberation"
-0.05   " know"
-0.05   " setId"
-0.05   " =============================================================="
-0.05   " directly"
-0.05   " Caul"

Positive Logits

Logit   Token (quoted to show leading spaces)
0.08    [token missing]
0.07    "adel"
0.07    " handle"
0.07    "opi"
0.07    "ade"
0.07    "handle"
0.07    "ocale"
0.07    " udrž"
0.07    "成功"
0.07    "HING"
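
Some tokens on dashboards like this are shown in the remapped form used by byte-level BPE vocabularies rather than as readable text; the strings æĪĲåĬŁ and udrÅ¾ are examples. The sketch below reverses that mapping. It assumes the tokenizer uses the GPT-2-style byte-to-unicode table, which the page does not confirm, and the helper names are hypothetical.

```python
def bytes_to_unicode():
    """GPT-2-style table mapping each byte value (0-255) to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_vocab_string(s: str) -> str:
    """Turn a byte-level vocabulary string back into UTF-8 text (assumed encoding)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in s)
    return raw.decode("utf-8", errors="replace")


print(decode_vocab_string("æĪĲåĬŁ"))
print(decode_vocab_string("udrÅ¾"))
```

Under that assumption the two strings decode to 成功 (Chinese for "success") and udrž (a Czech word fragment). A token that splits a multi-byte character will not decode cleanly, which is why the sketch uses errors="replace".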

Activations Density 0.033%
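
To put the density figure in scale, the arithmetic below combines it with the prompt configuration listed above. It assumes the density is the fraction of token positions with a nonzero activation, measured over the same 24,576 prompts of 128 tokens; the page does not state this, so treat the result as a rough estimate only.

```python
# Values taken from the dashboard configuration.
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.033  # "Activations Density 0.033%"

# Total token positions in the dashboard prompt set.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density = share of those positions where the feature is active.
estimated_active_tokens = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,.0f}")
```

Under that assumption the feature fires on roughly a thousand of about 3.1 million token positions.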