INDEX

Explanations

conditional phrases indicating desire or intent

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token       Logit
"zal"       -0.09
"attern"    -0.08
"iming"     -0.07
"asio"      -0.07
" werde"    -0.07
" will"     -0.07
"jerne"     -0.07
"awy"       -0.07
" würde"    -0.07
"óc"        -0.07

Positive Logits

Token       Logit
" like"     0.13
"like"      0.10
" likes"    0.10
" prefer"   0.09
" rather"   0.09
" Like"     0.09
"如"        0.08
"Like"      0.08
"_like"     0.08
" LIKE"     0.08

Activations Density 0.005%
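
As a rough sanity check, the density can be turned into an approximate count of activating tokens. The sketch below assumes the density is the fraction of token positions, across the dashboard prompts listed under Configuration, on which the feature is nonzero. That reading is an assumption, since the page does not define the figure, and the variable names are illustrative only.

```python
# Figures taken from the dashboard; how they combine is an assumption.
NUM_PROMPTS = 24_576         # "24,576 prompts"
TOKENS_PER_PROMPT = 128      # "128 tokens each"
DENSITY_PERCENT = 0.005      # "Activations Density 0.005%"

# Total token positions covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumed meaning: the share of those positions where the feature fires.
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")                  # total tokens: 3,145,728
print(f"expected active tokens: {expected_active:.1f}")   # expected active tokens: 157.3
```

Under that assumption the feature fires on roughly 157 of about 3.1 million tokens, which fits a narrowly scoped feature.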