Explanations

`location` or `baseline`

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each
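
As a rough sense of scale, the prompt count and prompt length above imply the total number of token positions the dashboard was computed over. The sketch below is only that arithmetic; it assumes every prompt is exactly 512 tokens long, and the names `NUM_PROMPTS`, `TOKENS_PER_PROMPT`, and `total_tokens` are illustrative, not part of any dashboard API.

```python
# Figures copied from the configuration above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Token positions covered, assuming every prompt has the full length."""
    return num_prompts * tokens_per_prompt


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
print(f"{total:,} token positions")  # 121,930,240 token positions
```

If some prompts were shorter than 512 tokens, the true count would be lower, so this is best read as an upper bound.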

Dataset (Dashboard)

lmsys + oasst1

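Negative Logits

| Token | Value |
|---|---|
| `infodisc` | 0.46 |
| `！` | 0.45 |
| ` Harlan` | 0.44 |
| ` تشتغل` | 0.42 |
| ` plummeted` | 0.40 |
| ` Bilal` | 0.40 |
| `鳌` | 0.40 |
| `फट` | 0.39 |
| ` ترین` | 0.39 |
| `ंपल` | 0.39 |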

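Positive Logits

| Token | Value |
|---|---|
| `Fig` | 0.44 |
| ` automatically` | 0.44 |
| ` total` | 0.42 |
| ` automatic` | 0.41 |
| `del` | 0.41 |
| ` image` | 0.40 |
| ` masks` | 0.40 |
| ` directions` | 0.40 |
| `ैनिक` | 0.40 |
| ` process` | 0.39 |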

Activations Density 0.000%