Explanations

"the" followed by sentence starters

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

[token not shown]: -0.10
"otr": -0.09
[token not shown]: -0.08
"::::::::::::": -0.08
".ServiceModel": -0.08
"_requested": -0.08
"errupted": -0.08
" Cous": -0.08
" Lange": -0.08
"ches": -0.08

Positive Logits

" song": 0.21
" poem": 0.17
" sentence": 0.13
" dish": 0.13
"song": 0.12
" message": 0.12
" recipe": 0.12
" piece": 0.12
" joke": 0.11
" story": 0.11
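
The page does not say how the logit lists above are produced. One common approach, assumed here, is to project a feature's decoder direction through the model's unembedding matrix and report the tokens with the largest and smallest resulting values. The sketch below uses a toy two-dimensional model and a four-token vocabulary; the function name `top_logit_tokens`, the matrix values, and the direction vector are all hypothetical and chosen only to illustrate the ranking step.

```python
import numpy as np

def top_logit_tokens(direction, unembed, vocab, k=2):
    """Rank vocabulary tokens by a feature direction's effect on the logits.

    direction: (d_model,) hypothetical decoder direction for the feature
    unembed:   (d_model, vocab_size) hypothetical unembedding matrix
    """
    effects = direction @ unembed            # one value per vocabulary token
    order = np.argsort(effects)              # ascending: most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data (assumed values, not taken from any real model).
vocab = [" song", " poem", "otr", "ches"]
unembed = np.array([
    [0.2, 0.1, -0.1, -0.05],
    [0.0, 0.1,  0.0, -0.04],
])
direction = np.array([1.0, 0.5])

positive, negative = top_logit_tokens(direction, unembed, vocab)
print("positive:", positive)
print("negative:", negative)
```

Tokens are kept as quoted strings so that a leading space, as in " song" versus "song", stays visible.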

Activations Density 0.151%
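
As a rough sanity check, the density figure can be turned into an approximate token count. This assumes the density is measured over the dashboard's 10,000 prompts of 128 tokens each, which the page does not state explicitly. The function name `active_token_estimate` is hypothetical.

```python
def active_token_estimate(n_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens on which the feature is active)."""
    total = n_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active

# Assumption: density is computed over the dashboard prompt set.
total, active = active_token_estimate(10_000, 128, 0.151)
print(total, active)
```

Under that assumption the feature would be active on roughly 1,900 of 1.28 million tokens.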