Explanations

starting phrase completions

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

Token          Logit
.Formatter     -0.16
¶Į             -0.15
¦æĥħ           -0.14
ÂĢÂĢ           -0.13
įng            -0.13
 -*-č\n        -0.12
******č\n      -0.12
łéĻ¤           -0.11
ıa             -0.11
.Dictionary    -0.11

Positive Logits

Token                  Logit
 )\n\n\n\n\n\n\n\n     0.09
 Cher                  0.08
."\n\n\n\n             0.08
 basically             0.08
 &#8203;&#8203;        0.08
âĢ¢                    0.08
(no visible token)     0.07
's                     0.07
'(                     0.07
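
Several tokens in the logit lists, such as âĢ¢ and the č in " -*-č\n", look like byte-level BPE surface forms rather than encoding damage. The page does not name the tokenizer, so the sketch below simply assumes the GPT-2-style byte-to-character table; the function names are illustrative and not taken from the dashboard. Under that assumption it maps a displayed token back to its raw bytes.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte value to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (controls, space, etc.) are shifted to code points 256 and up.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> bytes:
    """Convert a displayed token back to raw bytes (assumes GPT-2-style byte-level BPE)."""
    out = bytearray()
    for ch in token:
        if ch in BYTE_DECODER:
            out.append(BYTE_DECODER[ch])
        else:
            # Characters outside the table (e.g. a plain space) are passed through as UTF-8.
            out.extend(ch.encode("utf-8"))
    return bytes(out)


if __name__ == "__main__":
    for tok in ["âĢ¢", "č", "ÂĢÂĢ"]:
        raw = decode_token(tok)
        print(repr(tok), raw, repr(raw.decode("utf-8", errors="replace")))
```

Tokens such as ¶Į decode to byte sequences that are not valid UTF-8 on their own, which would be expected for tokens covering only part of a multi-byte character.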

Activations Density 0.003%
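
To put the density figure in context, the following is a rough calculation of how many token positions it corresponds to. It assumes the density was measured over the dashboard sample listed under Configuration (10,000 prompts of 128 tokens each), which the page does not state explicitly, and it treats the displayed 0.003% as exact even though it is presumably rounded.

```python
# Figures taken from the Configuration and Activations Density lines of the page.
NUM_PROMPTS = 10_000
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.003  # displayed value, presumably rounded

# Total token positions in the dashboard sample.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Expected number of positions where the feature is active,
# ASSUMING the density refers to this same sample.
expected_active = total_tokens * DENSITY_PERCENT / 100

if __name__ == "__main__":
    print(f"total tokens: {total_tokens:,}")
    print(f"expected active positions: {expected_active:.1f}")
```

Under those assumptions the feature would fire on only about 38 of the 1,280,000 token positions, so any interpretation of it rests on a small number of examples.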