Explanations

ensure fairness and order

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

" Codec"     -0.10
" vidé"      -0.09
"Ish"        -0.09
" Boeh"      -0.09
" Kurd"      -0.08
" discre"    -0.08
" Military"  -0.08
"RNG"        -0.08

Positive Logits

"fair"         0.19
" fair"        0.18
" transparent" 0.15
" Fair"        0.15
" fairness"    0.15
"Fair"         0.14
" confidence"  0.14
" deter"       0.14
" predict"     0.13
" Confidence"  0.13

Activations Density 0.078%
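
To give a sense of scale for the density figure, here is a minimal sketch that turns it into an approximate token count. It rests on two assumptions that this page does not state: that activation density means the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the dashboard prompts listed under Configuration (10,000 prompts, 128 tokens each). The function name and parameters are hypothetical, chosen only for illustration. Under those assumptions the feature would fire on about 998 of the 1,280,000 tokens.

```python
def estimated_active_tokens(n_prompts: int, tokens_per_prompt: int, density_percent: float) -> float:
    """Estimate how many tokens a feature fires on.

    Assumption (not confirmed by the page): density is the percentage of
    tokens with nonzero activation, measured over the dashboard prompts.
    """
    total_tokens = n_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


if __name__ == "__main__":
    # Values copied from the Configuration and Activations Density lines.
    estimate = estimated_active_tokens(10_000, 128, 0.078)
    print(f"approx. {estimate:.0f} active tokens out of {10_000 * 128:,}")
```

If the density was instead computed over a different token sample, only the ratio carries over, not the absolute count.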