INDEX

Explanations

phrases indicating reasons or justifications

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token        Logit
"inecraft"   -0.07
"yang"       -0.07
"-dot"       -0.07
"Interop"    -0.07
"ÑıÑģ"       -0.06   (byte-level form; likely decodes to Cyrillic "яс")
"onth"       -0.06
"/open"      -0.06
"means"      -0.06
"_DX"        -0.06
"merce"      -0.06

Positive Logits

Token        Logit
"why"        0.11
"why"        0.08
" needing"   0.07
"Why"        0.07
"Why"        0.07
" being"     0.07
" success"   0.07
"WHY"        0.07
"ä¸ºä»Ģä¹Ī"  0.07   (byte-level form; likely decodes to "为什么", Chinese for "why")
"not"        0.06
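
Two tokens in the logit lists (ä¸ºä»Ģä¹Ī and ÑıÑģ) look garbled, most likely because they are shown in the printable byte-level form that GPT-2-style BPE tokenizers use internally. The tokenizer is not named on this page, so treat that as an assumption. Under it, a minimal sketch of how such a string can be mapped back to readable text (the helper name `decode_token` is hypothetical):

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2's byte-level BPE.

    Printable bytes map to themselves; all other bytes are shifted to
    code points starting at 256 so that every byte has a visible stand-in.
    """
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Invert the table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into UTF-8 text (assumed encoding)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ä¸ºä»Ģä¹Ī"))  # the Chinese word for "why"
print(decode_token("ÑıÑģ"))        # two Cyrillic letters
```

Under this assumption the top positive token decodes to the Chinese word for "why", which is consistent with the English "why" variants elsewhere in the list.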

Activations Density 0.011%
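
To give the density figure a sense of scale, the short calculation below combines it with the dashboard prompt counts listed under Configuration. It assumes that density means the fraction of tokens on which the feature is active and that it was measured on a sample of comparable size to the dashboard prompts; neither is stated on this page.

```python
# Figures taken from the dashboard configuration above.
num_prompts = 24_576
tokens_per_prompt = 128

# Activation density as reported: 0.011% of tokens (assumed per-token fraction).
density = 0.011 / 100

total_tokens = num_prompts * tokens_per_prompt
expected_active_tokens = total_tokens * density

print(f"total tokens: {total_tokens:,}")                          # 3,145,728
print(f"expected active tokens: {expected_active_tokens:.0f}")    # about 346
```

Under those assumptions the feature would fire on only a few hundred of roughly three million tokens, which fits a narrow feature keyed to "why"-style tokens.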