Explanations

forest floor, extra features, light bulb, leading vehicle

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token       Value
"ADES"      0.37
"opropane"  0.37
"бні"       0.37
" DMBT"     0.36
"BIUM"      0.36
"ิทธิ์"       0.35
"Majority"  0.35
"스를"       0.35
"倩"         0.35
"اداس"      0.34

Positive Logits

Token       Value
"using"     0.37
"it"        0.37
"ems"       0.37
" just"     0.36
" shir"     0.36
" when"     0.36
"the"       0.35
" hvordan"  0.35
"gir"       0.35
"ros"       0.34

Activations Density 0.107%
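
To put the density figure in context, the sketch below works through the arithmetic using the numbers listed above. It rests on two assumptions that the dashboard does not state: that "Activations Density" means the percentage of token positions on which the feature has a nonzero activation, and that it was measured over the same 238,145 prompts of 512 tokens each. The function names are hypothetical and chosen only for illustration.

```python
# Values copied from the dashboard listing.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.107


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions, assuming every prompt has the same length."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Estimated number of positions where the feature fires.

    Assumption: density is the percentage of token positions with a
    nonzero activation, measured over the same set of prompts.
    """
    return total * density_percent / 100.0


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = expected_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
    print(f"total token positions: {total:,}")
    print(f"estimated active positions: {active:,.0f}")
```

Under these assumptions the prompt set contains 121,930,240 token positions, and the feature would be active on roughly 130,000 of them, which is why a density this low still leaves plenty of examples to inspect.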