INDEX

Explanations

questions, thanks, and informal remarks

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token         Logit
"💀"          1.18
":/"          1.13
"xD"          1.08
" fucking"    1.02
"✨"          1.01
" fucked"     1.00
"idk"         0.99
" fuck"       0.96
"👾"          0.95
"⚠️"          0.93

Positive Logits

Token             Logit
" Incidentally"   1.21
"Incidentally"    1.16
" Guess"          1.08
"Wouldn"          1.04
" incidentally"   0.97
" guess"          0.96
"Gee"             0.96
" Believe"        0.95
"Guess"           0.94
" terrific"       0.93

Activations Density 0.027%
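
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes, which this page does not state, that the density is the share of tokens in the listed prompt set on which the feature has a nonzero activation. The function and constant names are hypothetical and chosen only for illustration.

```python
# Figures copied from the Configuration and Activations Density lines.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.027  # displayed value, already rounded on the page


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of tokens with a nonzero activation.

    Assumption: density is measured over exactly `total` tokens.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"{total:,} tokens in total, roughly {active:,} with a nonzero activation")
```

Under that assumption the prompt set holds about 122 million tokens, and the feature would fire on about 33 thousand of them. Because the displayed percentage is rounded, the result is an order-of-magnitude estimate, not an exact count.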