INDEX

Explanations

different or same

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token           Logit
"imos"          -0.09
" LATIN"        -0.09
" konkrét"      -0.09
"сь"            -0.09
" spreads"      -0.09
"rippling"      -0.09
"heim"          -0.08
"teh"           -0.08
" imitation"    -0.08
"criptor"       -0.08

Positive Logits

Token           Logit
" different"    0.16
"不同"          0.16
" khác"         0.14
"不同的"        0.13
" farklı"       0.12
"different"     0.12
" same"         0.11
"_different"    0.11
" разных"       0.11
" responses"    0.11
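
Several entries in the logit lists are non-ASCII tokens. Vocabularies of the byte-level BPE kind store such tokens as stand-in characters, so 不同 appears in the raw vocabulary as `ä¸įåĲĮ` and " khác" as ` khÃ¡c`. The sketch below reverses that mapping. It assumes the model uses the GPT-2 style byte-to-unicode table, which the page does not state, and `decode_token` is a hypothetical helper name, not part of any dashboard API.

```python
def bytes_to_unicode():
    """GPT-2 style table: each byte value gets a printable stand-in character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Unprintable bytes are moved to code points starting at 256.
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw):
    """Turn a raw vocabulary string back into readable text (hypothetical helper)."""
    data = bytearray()
    for ch in raw:
        if ch in CHAR_TO_BYTE:
            data.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table (such as a literal
            # leading space shown by the dashboard) are already plain text.
            data.extend(ch.encode("utf-8"))
    # Tokens that split a multi-byte character decode with a replacement mark.
    return data.decode("utf-8", errors="replace")


for raw in ["ä¸įåĲĮ", " khÃ¡c", " farklÄ±", " ÑĢÐ°Ð·Ð½ÑĭÑħ"]:
    print(repr(raw), "->", repr(decode_token(raw)))
```

Decoded this way, the strongest positive logits are the word "different" in Chinese, Vietnamese, Turkish and Russian, which matches the "different or same" explanation.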

Activations Density 0.044%
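
The density figure can be turned into a rough token count using the dashboard configuration. The sketch below assumes that density means the fraction of dashboard token positions where the feature activation is nonzero, which the page does not spell out. `activation_density` and the toy data are illustrative only.

```python
def activation_density(activations):
    """Fraction of token positions with a nonzero activation (assumed definition)."""
    flat = [a for prompt in activations for a in prompt]
    return sum(1 for a in flat if a != 0) / len(flat)


# Figures taken from the page.
N_PROMPTS = 10_000
TOKENS_PER_PROMPT = 128
DENSITY = 0.044 / 100  # 0.044%

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
estimated_active = round(DENSITY * total_tokens)
print(total_tokens, estimated_active)

# Toy example (made-up activations): one active position out of eight.
toy = [[0.0, 0.0, 1.3, 0.0], [0.0, 0.0, 0.0, 0.0]]
print(activation_density(toy))
```

Under that assumption the feature fires on roughly 560 of the 1,280,000 token positions in the dashboard sample.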