INDEX

Explanations

many followed by time or description

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token            Logit
"ney"            -0.10
"ilon"           -0.10
"halb"           -0.09
"arges"          -0.09
"iller"          -0.09
"cs"             -0.09
"ses"            -0.09
" continents"    -0.09
" Wayback"       -0.08
"iti"            -0.08

Positive Logits

Token            Logit
"ToMany"         0.23
"fold"           0.22
"-many"          0.21
" different"     0.19
"-sided"         0.18
"yyy"            0.16
"/all"           0.16
"different"      0.14
"yyyy"           0.14
"atta"           0.14

Activations Density 0.049%
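
The density figure can be read as the fraction of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the zero threshold, and the toy data are hypothetical and not taken from the dashboard. If the density were measured over the 10,000 x 128 = 1,280,000 tokens listed under Configuration (also an assumption), 0.049% would correspond to roughly 627 activating tokens.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    `activations` is a list of prompts, each a list of per-token activation
    values for one feature. The threshold of 0.0 is an assumption.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    # Guard against an empty input rather than dividing by zero.
    return active / total if total else 0.0


# Toy data (hypothetical): 2 prompts x 4 tokens, feature fires on 1 of 8 positions.
toy = [[0.0, 0.0, 1.7, 0.0], [0.0, 0.0, 0.0, 0.0]]
density = activation_density(toy)
print(f"{density:.3%}")  # 12.500%
```

A low density such as 0.049% indicates a sparse feature that is active on only a small share of tokens.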