INDEX

Explanations

instances of negative outcomes or consequences

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token         Logit
"agan"        -0.07
"_DISABLE"    -0.07
"ilion"       -0.07
" nominal"    -0.07
"ứng"         -0.07
"ucci"        -0.07
"izr"         -0.07
"hop"         -0.07
"ventus"      -0.06
"itter"       -0.06

Positive Logits

Token         Logit
" Comments"   0.07
"ufen"        0.07
" comments"   0.07
              0.06
"aches"       0.06
"eln"         0.06
"ampo"        0.06
"Tun"         0.06
"ETCH"        0.06
" holidays"   0.06

Activations Density 0.001%
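
The configuration gives the size of the prompt set (24,576 prompts, 128 tokens each), and the density figure gives the fraction of tokens on which the feature fires. As a rough sanity check, the Python sketch below multiplies these out. It assumes the 0.001% density was measured over this same prompt set, which the dashboard does not state explicitly, and it treats the displayed density as exact although it is probably rounded. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.001  # displayed value; assumed (not confirmed) to be rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of token positions where the feature is active.

    Assumes the density was computed over the same `total` tokens.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"approx. active tokens: {active:.1f}")
```

Under these assumptions the feature would be active on only a few dozen of roughly three million token positions, so the exact count should be read as an order-of-magnitude estimate.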