INDEX

Explanations

if... not, otherwise

Top Features by Cosine Similarity

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

"igans"        -0.09
"plode"        -0.09
"Yer"          -0.08
"aklı"         -0.08
"oard"         -0.08
"eed"          -0.08
" somebody"    -0.08
" ---------------------------------------------------------------------------\n"    -0.08
"fortunately"  -0.08
"olet"         -0.08

Positive Logits

"not"          0.17
"rame"         0.15
" otherwise"   0.12
" же"          0.12
"fy"           0.12
"rames"        0.11
"rit"          0.11
" 그렇"        0.11
" Otherwise"   0.11
"not"          0.11

Activations Density 0.021%
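
To put the density figure in perspective, here is a minimal sketch of the arithmetic. It assumes that "Activations Density" means the percentage of tokens on which the feature has a nonzero activation, and that the figure can be applied to the dashboard sample of 10,000 prompts of 128 tokens each. Neither assumption is confirmed by the page, and the function name is hypothetical.

```python
def expected_active_tokens(n_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> float:
    """Rough number of tokens on which the feature fires.

    Assumption: density_percent is the share of all tokens with a
    nonzero activation, measured over the same prompts counted here.
    """
    total_tokens = n_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


# Values taken from the dashboard: 10,000 prompts, 128 tokens each, 0.021%.
total = 10_000 * 128
estimate = expected_active_tokens(10_000, 128, 0.021)
print(f"{total:,} tokens in the sample, roughly {estimate:.0f} with nonzero activation")
```

Under these assumptions the feature would fire on only a few hundred tokens of the sample, which fits a narrow feature tied to "if not ... otherwise" constructions.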