Explanations

builds on existing concepts

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token              Logit
" Majority"        0.43
" reprinted"       0.40
" majority"        0.39
"majority"         0.39
" Background"      0.39
" Consciousness"   0.38
" Underlying"      0.38
" History"         0.38
" Origin"          0.37
"*}"               0.37

Positive Logits

Token              Logit
" themes"          0.65
"themes"           0.59
" темы"            0.55
"concepts"         0.55
" concepts"        0.53
" conceptos"       0.53
"lessons"          0.52
" lessons"         0.50
"years"            0.48
" successes"       0.48

Activations Density 0.030%