Explanations

preceding punctuation

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each

Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

Token        Logit
"Because"    -0.08
" từng"      -0.08
"¶Į"         -0.08
"・ア"       -0.08
"他の"       -0.08
" Screw"     -0.08
"dub"        -0.08
"mbH"        -0.08
"beck"       -0.08
"урн"        -0.08

Positive Logits

Token              Logit
"one"              0.16
" Specifically"    0.12
" specifically"    0.12
" There"           0.12
" there"           0.11
"One"              0.11
"There"            0.11
"有一"             0.10
"Fam"              0.10
"The"              0.10

Activations Density 0.248%