Explanations

references to government control and censorship over communication channels

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token  Logit
"ラス"  -0.08
"مو"  -0.07
" درب"  -0.07
".TextInput"  -0.07
" stim"  -0.07
"_ARCH"  -0.07
" trillion"  -0.07
"ippets"  -0.07
"aned"  -0.06
" scand"  -0.06

Positive Logits

Token  Logit
" block"  0.08
":block"  0.07
"blocks"  0.07
" blocked"  0.07
"krom"  0.07
" content"  0.06
"olle"  0.06
" BLOCK"  0.06
" censorship"  0.06
"-block"  0.06

Activations Density 0.010%