Explanations

instances of cautionary language or warnings

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token        Logit
"awn"        -0.08
"ien"        -0.07
"alam"       -0.07
"unter"      -0.06
"stan"       -0.06
"бира"       -0.06
"ansen"      -0.06
"_secure"    -0.06
"anna"       -0.06
"alte"       -0.06

Positive Logits

Token        Logit
" against"   0.14
" about"     0.14
"against"    0.12
" Against"   0.11
"Against"    0.11
" tentang"   0.09
"about"      0.09
"_about"     0.09
"eware"      0.09
"_again"     0.08

Activations Density 0.008%
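
To put the density figure in context, here is a minimal arithmetic sketch that combines it with the prompt configuration listed above. It assumes (this is not stated on the page) that the density is the fraction of tokens in the dashboard sample on which the feature activates. The variable names are illustrative only.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.008  # "Activations Density" as displayed

# Total number of tokens in the dashboard sample.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# ASSUMPTION: density = activating tokens / total tokens in this sample.
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:.0f}")
```

Under that assumption, the feature fires on only a few hundred of the roughly 3.1 million sampled tokens. Because the displayed density is rounded, the derived count is approximate.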