Explanations

confined to following instructions

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Logit   Token (quoted; a leading space is part of the token)
-0.10   (token missing in source)
-0.10   "Gab"
-0.09   "기도"
-0.09   " губ"
-0.09   "aub"
-0.08   "ancel"
-0.08   "aben"
-0.08   "è"
-0.08   "ishi"
-0.08   "Tab"

Positive Logits

Logit   Token (quoted; a leading space is part of the token)
0.44    " limited"
0.37    " restricted"
0.36    "limited"
0.33    " Stick"
0.33    " stick"
0.33    " confined"
0.32    "限定"
0.32    " limit"
0.30    " Limited"
0.30    "Stick"
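Dashboards of this kind often display non-ASCII tokens in the tokenizer's byte-level form, in which `éĻĲå®ļ` stands for `限定` and ` Ð³ÑĥÐ±` for ` губ`. The sketch below shows one way to recover the readable text. It assumes the model uses a GPT-2-style byte-level BPE vocabulary (the source does not name the tokenizer) and that the page has already replaced the space marker with a literal space. The function names are hypothetical.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned the next code point from U+0100 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Turn a byte-level display string back into UTF-8 text.

    Assumption: characters outside the table with code points below 256
    (such as a literal space) are taken as that byte directly. Anything
    else raises ValueError. Incomplete byte sequences decode to U+FFFD.
    """
    raw = bytes(CHAR_TO_BYTE.get(ch, ord(ch)) for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_display_token("éĻĲå®ļ"))    # → 限定
print(decode_display_token("ê¸°ëıĦ"))    # → 기도
print(decode_display_token(" Ð³ÑĥÐ±"))   # →  губ (with the leading space)
```

A token such as `è` corresponds to a single byte (0xE8), which is only the first byte of a multi-byte UTF-8 character, so it has no readable decoding on its own.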

Activations Density: 0.223%
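To put the density figure in context, the arithmetic below estimates how many tokens the feature fires on. It assumes that density means the fraction of tokens with a nonzero activation and that it was measured over the dashboard sample listed under Configuration. Neither assumption is confirmed by the page itself, and the variable names are illustrative.

```python
# Figures taken from the Configuration section and the density line.
n_prompts = 10_000
tokens_per_prompt = 128
density_percent = 0.223

# Total tokens in the dashboard sample.
total_tokens = n_prompts * tokens_per_prompt

# Assumed definition: density = active tokens / total tokens.
expected_active = density_percent / 100 * total_tokens

print(f"total tokens:            {total_tokens:,}")
print(f"estimated active tokens: {round(expected_active):,}")
```

Under these assumptions the feature is active on roughly 2,850 of 1,280,000 tokens, or about one token in every 450.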