Explanations

fiction theory

Configuration

Prompts (Dashboard): 10,000 prompts, 128 tokens each
Dataset (Dashboard): lmsys/lmsys-chat-1m

Negative Logits

(token not captured)  -0.10
" empir"              -0.10
"adero"               -0.09
"_topics"             -0.09
"illa"                -0.09
" pragmatic"          -0.08
"irá"                 -0.08
" Whisper"            -0.08
"(HWND"               -0.08

Positive Logits

" theory"       0.41
"理论"          0.38
" theoretical"  0.37
" abstract"     0.35
" теор"         0.35
"theory"        0.33
" Theory"       0.32
" theoret"      0.32
" THEORY"       0.32
"Theory"        0.29

Activations Density 0.219%
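
As a rough illustration of what a density figure like this means, the sketch below computes the fraction of token positions on which a feature is active and estimates the implied count. It assumes the density is measured over the 10,000 prompts of 128 tokens listed under Configuration and that "active" means an activation above zero; neither is confirmed by the page. The function name `activation_density` is hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    `activations` is a list of prompts, each a list of per-token activation values.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Back-of-envelope estimate under the assumptions stated above.
total_tokens = 10_000 * 128                        # 1,280,000 token positions
approx_active = round(total_tokens * 0.219 / 100)  # about 2,803 positions

# Toy example: one active position out of four.
toy_density = activation_density([[0.0, 1.5], [0.0, 0.0]])
```

Under these assumptions, a density of 0.219% corresponds to roughly 2,800 of the 1.28 million sampled token positions.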