Explanations

references to danger and violence associated with a specific group or agenda

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token               Logit
" transc"           -0.07
" Satisfaction"     -0.07
" satisfaction"     -0.07
" satisf"           -0.06
"oker"              -0.06
"zer"               -0.06
" transcend"        -0.06
"ember"             -0.06
"fect"              -0.06
"opia"              -0.06

Positive Logits

Token               Logit
"TES"               0.07
".xhtml"            0.07
"ernet"             0.07
"XHR"               0.07
"gne"               0.07
".twig"             0.07
" små"              0.07
" nackte"           0.07
"ledo"              0.07
" POSSIBILITY"      0.06

Activations Density 0.011%
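
To give a sense of scale, the prompt configuration and the activation density can be combined into a rough count of activating tokens. The sketch below assumes that the density is the fraction of all dashboard token positions on which the feature fires, which the page does not state explicitly. The reported density is rounded, so the result is only an estimate.

```python
# Figures copied from the dashboard page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.011  # rounded value as displayed

# Total token positions covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = share of those positions with a nonzero activation.
estimated_active_tokens = total_tokens * ACTIVATION_DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")                           # total tokens: 3,145,728
print(f"estimated active tokens: {estimated_active_tokens:.0f}")   # estimated active tokens: 346
```

Under that assumption the feature fires on only a few hundred of roughly 3.1 million tokens. That would be consistent with a sparse feature.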