INDEX

Explanations

expressions related to adherence to rules or standards

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

utz        -0.07
anie       -0.07
cls        -0.07
steen      -0.07
ised       -0.07
cole       -0.07
ile        -0.07
uell       -0.07
ediator    -0.07
ceries     -0.06

Positive Logits

GGLE        0.08
antly       0.08
ably        0.08
evil        0.07
άσ          0.07
 rules      0.06
-invalid    0.06

Activations Density 0.002%