Explanations

bootstrap and increased values

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"$-("           0.58
"(-("           0.52
"//$"           0.51
"-("            0.50
"//}"           0.49
"-("            0.48
"}$-("          0.41
"(!("           0.40
" allegedly"    0.39
"//{"           0.38

Positive Logits

" разум"        0.44
" Bootstrap"    0.44
" Thats"        0.43
"Thats"         0.41
" Increased"    0.40
" increased"    0.40
" bootstrap"    0.39
"Increased"     0.39
"bootstrap"     0.39
" Boot"         0.38

Activations Density 0.001%