titles and subsequent words

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

[token not captured]  0.61
"<start_of_image>"  0.54
" using"  0.45
".."  0.44
" these"  0.43
" member"  0.42
" depending"  0.41
" here"  0.41

Positive Logits

"<unused1954>"  0.76
"<unused162>"  0.76
"<unused1834>"  0.76
"<unused1153>"  0.76
"<unused1992>"  0.74
"<unused1678>"  0.73
"<unused305>"  0.73
"<unused1845>"  0.73
"<unused565>"  0.72
"<unused1101>"  0.71

Activations Density 0.029%