Explanations

`<|text start|>` or `<start>`

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each
Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token        Logit
` bare`      -0.09
` ***!\n`    -0.09
`/misc`      -0.08
` unus`      -0.08
`âĶĺ`        -0.08
` bait`      -0.08
`Ãºt`        -0.08
`sur`        -0.08
` worm`      -0.08
`lox`        -0.08

Positive Logits

Token        Logit
`><`         0.10
`wes`        0.09
`ACKET`      0.09
` Cham`      0.09
` hybrid`    0.09
`Wes`        0.09
` Blond`     0.09
` Hybrid`    0.08
`åµ`         0.08
`ored`       0.08
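The page does not say how the logit lists are produced. A common approach for this kind of feature dashboard, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and report the most negative and most positive entries. The sketch below uses random toy data; `decoder_dir`, `unembed`, `vocab`, and `top_logit_tokens` are hypothetical names for illustration, not the dashboard's API.

```python
import numpy as np

def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Return (k most negative, k most positive) lists of (token, logit)."""
    # Assumption: logit effect = decoder direction projected through the unembedding.
    logits = decoder_dir @ unembed            # shape: (vocab_size,)
    order = np.argsort(logits)                # indices sorted by ascending logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy stand-ins for a real model (random values, not the feature shown here).
rng = np.random.default_rng(0)
d_model, vocab_size = 16, 50
unembed = rng.normal(size=(d_model, vocab_size))
decoder_dir = rng.normal(size=d_model)
vocab = [f"tok{i}" for i in range(vocab_size)]

negative, positive = top_logit_tokens(decoder_dir, unembed, vocab, k=3)
```

With a real model, `vocab` would hold the tokenizer's token strings, which is why entries such as ` bare` (leading space) and `sur` (no leading space) can appear as distinct tokens.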

Activations Density: 0.013%
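To give the density figure a sense of scale, the arithmetic below converts it to an approximate token count. This assumes the density is the fraction of tokens on which the feature is active and that it is measured over the dashboard prompt set listed under Configuration (10,000 prompts of 128 tokens); the page does not confirm either point.

```python
# Values taken from the Configuration and Activations Density entries.
n_prompts = 10_000
tokens_per_prompt = 128
density_percent = 0.013

# Total tokens in the dashboard prompt set.
total_tokens = n_prompts * tokens_per_prompt

# Assumption: density is the share of those tokens with a nonzero activation.
approx_active_tokens = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under these assumptions the feature fires on roughly 166 of 1,280,000 tokens, which is consistent with a very sparse feature.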