Explanations

introduces new updates

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

(Tokens are quoted so that leading spaces stay visible.)

"howto"          0.42
" ווי"           0.41
" Fatty"         0.38
" Mutant"        0.38
" Neuroscience"  0.37
" Wicked"        0.37
"Pete"           0.36
" حوالے"         0.35
"푥"              0.35
"водство"        0.35

Positive Logits

(Tokens are quoted so that leading spaces stay visible.)

"exit"           0.48
" exit"          0.47
"Exit"           0.45
" exits"         0.41
" Schall"        0.41
" four"          0.40
"rianças"        0.40
" verify"        0.40
" Exit"          0.39
" assumption"    0.39

Activations Density: 0.001%
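
To give a rough sense of scale for the density figure, the sketch below combines it with the dashboard prompt configuration. It assumes that "activations density" means the fraction of dashboard tokens on which the feature has a nonzero activation, and that the 0.001% shown is a rounded value, so the result is an order-of-magnitude estimate only. The variable names are illustrative and do not come from the page.

```python
# Rough estimate of how many dashboard tokens activate this feature.
# Assumption: density = fraction of tokens with nonzero activation.
# Assumption: 0.001% is a rounded display value, so this is approximate.

num_prompts = 238_145        # from "Prompts (Dashboard)"
tokens_per_prompt = 512      # from "Prompts (Dashboard)"
density_percent = 0.001      # from "Activations Density"

total_tokens = num_prompts * tokens_per_prompt
approx_active_tokens = total_tokens * density_percent / 100

print(f"total dashboard tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,.0f}")
```

Under these assumptions the feature fires on roughly one token in every hundred thousand, which is consistent with a very sparse feature.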