Explanations

s followed by "'est", "ujet", "'é", "issy"

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

 célib

-0.10

 poil

-0.10

 nouve

-0.10

 Starr

-0.10

pec

-0.10

 advoc

-0.09

 Chí

-0.09

inator

-0.09

 eiusmod

-0.09

 Encore

-0.09

Positive Logits

0.12

ied

0.11

rie

0.11

rs

0.11

ango

0.10

aper

0.10

sey

0.10

urs

0.10

aine

0.09

otte

0.09

Activations Density 0.022%
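
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It rests on two assumptions that the page does not state: that activation density means the fraction of token positions where the feature has a nonzero activation, and that it was measured over the dashboard prompt set listed under Configuration (10,000 prompts, 128 tokens each). The function name is hypothetical.

```python
# Assumption (not stated on the page): density is the fraction of token
# positions with nonzero activation, measured over the dashboard prompts.
PROMPTS = 10_000           # from Configuration: "10,000 prompts"
TOKENS_PER_PROMPT = 128    # from Configuration: "128 tokens each"
DENSITY_PERCENT = 0.022    # from "Activations Density 0.022%"


def expected_active_tokens(prompts, tokens_per_prompt, density_percent):
    """Return (total token positions, estimated positions where the feature fires)."""
    total = prompts * tokens_per_prompt
    return total, total * density_percent / 100.0


total_tokens, active_tokens = expected_active_tokens(
    PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"{total_tokens} token positions, about {round(active_tokens)} active")
```

Under those assumptions the feature fires on roughly 280 of 1,280,000 token positions, which fits a sparse feature tied to a narrow set of subword continuations.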