Explanations

phrases indicating trust and assurance in service or capabilities

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token       Logit
"inen"      -0.07
" inval"    -0.07
"ÐķÐ¡"      -0.07
"èľľ"       -0.06
"yo"        -0.06
"edo"       -0.06
"esar"      -0.06
"charm"     -0.06
"ander"     -0.06
"edom"      -0.06

Positive Logits

Token       Logit
"leave"     0.07
"-safe"     0.06
"arges"     0.06
"hdl"       0.06
" capable"  0.06
"orry"      0.06
"axy"       0.06
" safe"     0.06
" safer"    0.06
"?(:"       0.06

Activations Density 0.061%
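
To put the density figure in context, the arithmetic below combines it with the prompt configuration listed above (24,576 prompts of 128 tokens each). This is a rough sketch, assuming that the density is the fraction of token positions in that prompt set on which the feature has a nonzero activation; the page itself does not state this definition, and the function names are hypothetical.

```python
# Figures copied from the dashboard above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.061 / 100  # 0.061% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Rough count of token positions where the feature fires.

    Assumption: density is measured as a fraction of token positions
    in this same prompt set.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under that assumption, the feature would fire on roughly 1,900 of about 3.1 million token positions, which is consistent with a sparse feature.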