INDEX

Explanations

with others

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

 Families

-0.10

所有

-0.10

 Starr

-0.10

 Bris

-0.09

 outsider

-0.09

guy

-0.09

REA

-0.09

ある

-0.08

aled

-0.08

 Kitt

-0.08

Positive Logits

 others

0.53

others

0.42

Others

0.39

 Others

0.38

 anderen

0.27

 دیگران

0.24

 other

0.23

 otros

0.21

 başk

0.20

 altri

0.20

Activations Density 0.051%
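
To put the density figure in context, here is a minimal sketch of how such a number could be computed and what it implies for this dashboard. It assumes that density means the fraction of token positions where the feature's activation is above zero, and that it is measured over the 10,000 prompts of 128 tokens listed under Configuration. Neither assumption is confirmed by the page, and the function name `activation_density` is hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token
    activation values. This definition of density is an assumption.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Figures taken from the dashboard: 10,000 prompts, 128 tokens each, 0.051%.
NUM_PROMPTS = 10_000
TOKENS_PER_PROMPT = 128
DENSITY = 0.051 / 100

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
# Approximate number of tokens on which the feature fires, assuming the
# density was measured over exactly this prompt set.
approx_active_tokens = round(total_tokens * DENSITY)

print(total_tokens, approx_active_tokens)
```

Under those assumptions, the feature would be active on roughly 650 of the 1,280,000 dashboard tokens, which fits a feature tied to a narrow set of tokens such as "others" and its translations.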