INDEX

Explanations

references to individuals providing testimony or observations

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

asan      -0.09
adesh     -0.09
asar      -0.08
ERIC      -0.08
ady       -0.08
igi       -0.08
aign      -0.08
adiens    -0.08
iminal    -0.07
spa       -0.07

Positive Logits

ry        0.09
ess       0.09
ively     0.07
(es       0.07
RY        0.07
dom       0.06
marshal   0.06
ย         0.06
arrant    0.06
es        0.06

Activations Density 0.005%
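
To put the density figure in scale, the sketch below combines it with the prompt configuration listed above (24,576 prompts of 128 tokens each). It assumes that "activations density" means the fraction of token positions on which the feature fires, and that it was measured over that full prompt set; neither is stated on the page, and the function and variable names are illustrative only. Because 0.005% is itself a rounded value, the result is a rough estimate.

```python
# Rough scale check for the dashboard figures.
# Assumption: density = share of token positions where the feature is active,
# measured over the whole prompt set listed under Configuration.

N_PROMPTS = 24_576          # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128     # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.005     # from "Activations Density"


def total_tokens(n_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return n_prompts * tokens_per_prompt


def expected_active_tokens(n_tokens: int, density_percent: float) -> float:
    """Approximate count of token positions where the feature is active."""
    return n_tokens * density_percent / 100.0


n_tokens = total_tokens(N_PROMPTS, TOKENS_PER_PROMPT)
n_active = expected_active_tokens(n_tokens, DENSITY_PERCENT)

print(f"total tokens: {n_tokens:,}")
print(f"approx. active tokens: {n_active:.0f}")
```

Under these assumptions the feature would be active on roughly 157 of about 3.1 million token positions, which is consistent with a narrowly scoped feature.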