Explanations

Roman numerals II and III

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token                                   Logit
"isel"                                  -0.10
" Silent"                               -0.09
"ingly"                                 -0.09
" silent"                               -0.09
"ffer"                                  -0.09
"plate"                                 -0.09
"７" (fullwidth digit seven, U+FF17)    -0.09
"agine"                                 -0.09
"ORMAT"                                 -0.09
"isman"                                 -0.08

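Positive Logits

Token                                   Logit
"teen"                                  0.13
"Ι" (Greek capital iota, U+0399)        0.12
"IB"                                    0.12
"І" (Cyrillic capital I, U+0406)        0.11
"teenth"                                0.10
"-V"                                    0.10
" عشر" (Arabic, "ten")                  0.10
"IE"                                    0.09
"muh"                                   0.09
"ème"                                   0.09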

Activations Density: 0.044%
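
To put the density figure in context, the short sketch below estimates how many dashboard tokens it corresponds to. It assumes (this is not stated on the page) that the density is the fraction of tokens in the dashboard prompts on which the feature has a non-zero activation, and that all 10,000 prompts are exactly 128 tokens long. The variable names are illustrative only.

```python
# Rough estimate of how many tokens the feature fires on.
# Assumption: density = (tokens with non-zero activation) / (all dashboard tokens).

NUM_PROMPTS = 10_000        # from "10,000 prompts"
TOKENS_PER_PROMPT = 128     # from "128 tokens each"
DENSITY_PERCENT = 0.044     # from "Activations Density 0.044%"

# Total number of tokens in the dashboard sample.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Convert the percentage to a fraction and scale by the token count.
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"Total tokens: {total_tokens:,}")
print(f"Estimated active tokens: {round(estimated_active_tokens):,}")
```

Under these assumptions the feature would be active on a few hundred of the roughly 1.28 million tokens in the sample, which fits a sparse feature tied to a narrow pattern such as Roman numerals.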