Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Launcher        0.47
Lateral         0.46
Suggestion      0.44
Bracketing      0.44
Manipulation    0.44
Removal         0.44
 cleats         0.43
TableHeader     0.42
ToUpper         0.42
糈              0.42

Positive Logits

are             0.52
ete             0.46
 anzi           0.44
lur             0.42
彼の            0.42
mese            0.40
 garantiza      0.40
 ایپ            0.40
engono          0.39

Activations Density 0.003%