Explanations

resolve disputes

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each
Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

" discredit"   -0.09
" outspoken"   -0.09
" backlash"    -0.08
"enek"         -0.08
"災"           -0.08
"apon"         -0.08
"kke"          -0.08
"utter"        -0.08
" fingert"     -0.08
" alliances"   -0.08

Positive Logits

" dispute"       0.31
" disputes"      0.30
" differences"   0.27
"争"             0.24
" conflict"      0.24
" conflicts"     0.21
" Differences"   0.21
" issues"        0.20
" tranh"         0.20
" difference"    0.19

Activations Density 0.092%
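
To put the density figure in concrete terms, here is a minimal sketch. It assumes that "Activations Density" means the share of dashboard tokens on which the feature has a nonzero activation, and that it is measured over the prompt set listed under Configuration. The page states neither of these, and the function name is hypothetical.

```python
def expected_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> float:
    """Estimate how many tokens activate the feature.

    Assumption: density is the percentage of all dashboard tokens
    with a nonzero activation (not confirmed by the page).
    """
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


# Figures taken from the page: 10,000 prompts of 128 tokens, density 0.092%.
total = 10_000 * 128
active = expected_active_tokens(10_000, 128, 0.092)
print(total, round(active))  # 1280000 1178
```

Under that reading, the feature fires on roughly 1,200 of the 1.28 million dashboard tokens, which is consistent with a sparse, topic-specific feature.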