INDEX

Explanations

"ux" followed by "u"

Top Features by Cosine Similarity

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

" Thrones"          -0.09
",None"             -0.09
"ARB"               -0.08
" Preconditions"    -0.08
"泥"                -0.08
"νια"               -0.08
" alertController"  -0.07
"UNK"               -0.07

Positive Logits

"ptive"             0.09
"pst"               0.08
" &#8203;&#8203;"   0.08
"Mage"              0.08
"데이트"            0.08
"ecute"             0.07
"ochen"             0.07
" instruments"      0.07
"mega"              0.07
" rico"             0.07

Activations Density 0.030%
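
To give the density figure a sense of scale, here is a rough arithmetic sketch. It assumes, and the page does not confirm this, that the activation density is the fraction of tokens in the dashboard prompt set (10,000 prompts of 128 tokens each) on which the feature activates. The function name and constants are hypothetical and used only for illustration.

```python
# Figures copied from the dashboard; how they relate is an ASSUMPTION:
# density is taken to be the share of dashboard tokens with a nonzero activation.
NUM_PROMPTS = 10_000
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.030


def implied_active_tokens(num_prompts: int, tokens_per_prompt: int,
                          density_percent: float) -> float:
    """Return the token count implied by a density given in percent."""
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100


total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
active_tokens = implied_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY_PERCENT
)

print(f"total tokens: {total_tokens:,}")
print(f"implied activating tokens: {round(active_tokens):,}")
```

Under that assumption the feature would fire on only a few hundred of roughly 1.28 million tokens, which fits a narrow explanation such as "ux" followed by "u".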