Explanations

proper nouns related to individuals

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

"isbury"       -0.07
"edes"         -0.07
"è©"           -0.07
" Ekon"        -0.07
"leurs"        -0.07
" single"      -0.06
"arro"         -0.06
"ISTRIBUT"     -0.06
" armed"       -0.06
"UnderTest"    -0.06

Positive Logits

"/File"        0.07
"apis"         0.07
"aida"         0.06
"kel"          0.06
"zer"          0.06
"Ì£"           0.06
"ungen"        0.06
" (*(("        0.06
"AP"           0.06
" safeg"       0.06

Activations Density 0.002%
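
To put the density figure in context, the sketch below estimates how many tokens in the dashboard prompts would activate this feature. It assumes that "Activations Density" means the fraction of all prompt tokens with a nonzero activation; the page does not state this definition. The function names are illustrative only, and because the displayed 0.002% is rounded, the result is approximate.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # "Activations Density" as displayed (rounded)


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens on which the feature fires.

    Assumption: density is the fraction of tokens with nonzero activation.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:.0f}")
```

Under that assumption the feature fires on only a few dozen of roughly three million tokens, which is consistent with the very small logit magnitudes listed above.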