INDEX

Explanations

references to addiction and withdrawal experiences

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

fsp

-0.07

 vess

-0.07

íř

-0.07

ラー

-0.07

 trang

-0.07

 дот

-0.07

iple

-0.07

ndl

-0.07

inou

-0.07

dál

-0.06
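
Byte-level BPE vocabularies store non-ASCII tokens as strings that look like encoding debris: for example, `dÃ¡l` is the stored form of `dál`. The sketch below shows one way to decode such strings, assuming the model uses the GPT-2 style byte-to-character table (the page does not name the tokenizer, so this is an assumption). The names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode() -> dict:
    """Byte -> character table, following the GPT-2 byte-level BPE convention."""
    # Bytes that already map to a visible character keep that character.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))  # U+00AE .. U+00FF
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to U+0100 and above, in increasing order.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a stored vocabulary string back into readable text."""
    raw = bytearray()
    for ch in token:
        if ch == " ":
            # Assumption: the page already rendered the space marker as a literal space.
            raw.append(0x20)
        else:
            raw.append(CHAR_TO_BYTE[ch])
    # A token may hold only part of a multi-byte character, so avoid strict decoding.
    return raw.decode("utf-8", errors="replace")


print(decode_token("d\u00c3\u00a1l"))  # stored form "dÃ¡l" -> prints "dál"
```

Plain ASCII tokens such as `fsp` pass through unchanged, so the function can be applied to every entry in the list.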

Positive Logits

 withdrawal

0.20

 Withdraw

0.17

 withdrawals

0.17

withdraw

0.16

 withdraw

0.16

Withdraw

0.15

 detox

0.14

 withdrawing

0.13

 taper

0.12

Det

0.12

Activations Density 0.009%
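
To put the density figure in context, the short calculation below converts it into an approximate token count. It assumes the density is measured over the dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each); the page does not state this, so treat the result as a rough estimate only.

```python
PROMPTS = 24_576           # from the Configuration section
TOKENS_PER_PROMPT = 128    # from the Configuration section
DENSITY_PERCENT = 0.009    # reported activation density

# Total tokens in the prompt set.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"{total_tokens:,} tokens, about {expected_active:.0f} with a nonzero activation")
```

Under that assumption the feature fires on only a few hundred of roughly three million tokens, which is consistent with a narrow feature tied to withdrawal-related vocabulary.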