Explanations

bullet points or introductions

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

 sinusoid           0.23
 heuristics         0.21
 volatiles          0.21
 tradeoffs          0.21
 bezier             0.21
 analogs            0.20
 hyperparameters    0.20
،                   0.20
🧖                  0.20
 embeddings         0.20

Positive Logits

Not                 0.40
 They               0.39
 Only               0.36
All                 0.36
 Does               0.36
 When               0.35
 Have               0.35
 Many               0.35
 Which              0.35
 That               0.35

Activations Density 0.680%