Explanations

words indicating irrational behavior or unpredictability

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each
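As a quick check on the size of the prompt set, the total token count follows directly from the two figures above. The sketch below is only arithmetic on those numbers; the names `NUM_PROMPTS`, `TOKENS_PER_PROMPT`, and `total_tokens` are illustrative and not part of the dashboard.

```python
# Figures taken from the Prompts (Dashboard) line above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Return the total number of tokens across all prompts."""
    return num_prompts * tokens_per_prompt


TOTAL_TOKENS = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
print(f"{TOTAL_TOKENS:,} tokens")  # 3,145,728 tokens
```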

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token     Logit
íĥĪ       -0.06
ishing    -0.06
reate     -0.06
ainen     -0.06
akens     -0.06
Vog       -0.06
å¯        -0.06
Ø§Ø¶      -0.06
åł´       -0.06
arium     -0.06

Positive Logits

Token     Logit
imes      0.07
ovsky     0.07
OMPI      0.07
kB        0.06
ivy       0.06
%[        0.06
 Richie   0.06
ief       0.06
 Telefon  0.06
asto      0.06

Activations Density 0.000%
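
A density of 0.000% is a rounded figure, so it need not mean the feature never fires. The sketch below assumes that density is the fraction of tokens with a strictly positive activation and that the dashboard rounds to three decimal places; both are assumptions, and `activation_density` and `format_density` are hypothetical helper names, not dashboard functions.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


def format_density(density: float) -> str:
    """Format a fraction as a percentage with three decimals (assumed rounding)."""
    return f"{density * 100:.3f}%"


# Toy example: one active token out of four.
print(format_density(activation_density([0.0, 0.0, 1.5, 0.0])))

# Hypothetical case: 10 active tokens out of 24,576 * 128 still displays as 0.000%.
print(format_density(10 / (24_576 * 128)))
```

Under these assumptions, any feature active on fewer than roughly 1 in 200,000 tokens would display as 0.000%.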