INDEX

Explanations

references to violence and its implications

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

nga        -0.09
ossible    -0.08
dlg        -0.08
Invariant  -0.08
ubit       -0.08
loha       -0.08
ibbon      -0.08
maze       -0.08
rolled     -0.08
řiv        -0.08

Positive Logits

iveness    0.08
uous       0.07
ous        0.07
ometown    0.07
ive        0.07
           0.06
/or        0.06
/security  0.06

Activations Density 0.015%