Explanations

asking questions and information

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token                Logit
"_UNDEF"             -0.09
"deer"               -0.09
"Ĩµ"                 -0.09
"ÃĹ\n\n"             -0.09
"veral"              -0.08
"oÄŁ"                -0.08
"PasswordEncoder"    -0.08
"_Lean"              -0.08
"_Tis"               -0.08
"usher"              -0.08

Positive Logits

Token                Logit
"ap"                 0.09
"Cli"                0.09
" likes"             0.09
"ex"                 0.08
"new"                0.08
"esan"               0.08
"or"                 0.08
"((-"                0.08
" sice"              0.08

Activations Density 0.122%
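
As a rough reading of the density figure: assuming it is the fraction of dashboard tokens on which the feature fires (the page does not define it, so this is an assumption), and assuming it was measured over the 10,000 prompts of 128 tokens each listed under Configuration, the implied number of activating tokens can be sketched as below. The variable names are illustrative only.

```python
# Back-of-the-envelope estimate, not taken from the dashboard itself.
# Assumption: density = (tokens where the feature is active) / (total tokens).
num_prompts = 10_000        # from "Prompts (Dashboard)"
tokens_per_prompt = 128     # from "Prompts (Dashboard)"
density_percent = 0.122     # from "Activations Density"

# Total tokens the dashboard would have scanned under this assumption.
total_tokens = num_prompts * tokens_per_prompt

# Approximate count of tokens on which the feature activates.
active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens:  {total_tokens:,}")
print(f"active tokens: ~{active_tokens:,}")
```

Under these assumptions the feature fires on roughly 1,562 of 1,280,000 tokens, which would make it a sparse feature.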