Explanations

identities and states of being

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

都会

-0.09

Iso

-0.09

alnız

-0.09

udem

-0.09

IDES

-0.09

-0.08

avy

-0.08

quist

-0.08

 Goldberg

-0.08

िशत

-0.08

Positive Logits

ewe

0.09

 är

0.09

jot

0.08

eam

0.08

ABCDEFGHIJKLMNOP

0.08

ABCDEFGHI

0.08

mons

0.08

 sommes

0.08

xing

0.08

wich

0.08

Activations Density 0.222%
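
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It assumes (this is not stated on the page) that the density is the fraction of sampled tokens on which the feature activates, and that the sample is the 10,000 prompts of 128 tokens each listed under Configuration. The function names are illustrative only.

```python
# Rough scale check for the reported activation density.
# ASSUMPTION: density = (tokens where the feature is active) / (tokens sampled),
# with the sample being the dashboard prompts (10,000 prompts x 128 tokens).


def total_tokens(n_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the sampled prompts."""
    return n_prompts * tokens_per_prompt


def estimated_active_tokens(density_percent: float, n_tokens: int) -> int:
    """Approximate number of token positions on which the feature activates."""
    return round(density_percent / 100 * n_tokens)


n_tokens = total_tokens(10_000, 128)
active = estimated_active_tokens(0.222, n_tokens)
print(f"{n_tokens:,} tokens sampled, roughly {active:,} with a nonzero activation")
```

Under these assumptions, 0.222% of 1,280,000 tokens is about 2,842 token positions, so the feature fires on only a small share of the sampled text.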