INDEX

Explanations

provide more context

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

" blurred"     -0.09
"createUrl"    -0.09
"olo"          -0.09
"Erd"          -0.09
" Guerrero"    -0.08
" RESERVED"    -0.08
"amba"         -0.08
"lasses"       -0.08
" pornos"      -0.08

Positive Logits

" context"     0.22
"context"      0.16
" Context"     0.15
" contexto"    0.15
".context"     0.14
"Context"      0.14
"\tcontext"    0.13
"(context"     0.13
" more"        0.13
"_CONTEXT"     0.13

Activations Density: 0.055%