Explanations

explaining emergent phenomena or behaviors

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

няў    0.45
 యొక్క    0.42
ギフト    0.42
如果您    0.42
ផង    0.41
)?    0.40
 sogar    0.40
 якщо    0.40
當    0.40
見    0.39

Positive Logits

 realtime    0.43
 coord    0.41
 acrylonitrile    0.41
它是    0.41
 polystyrene    0.40
 octopus    0.39
 semic    0.39
 monthly    0.39
<0xC2>    0.39

Activations Density 0.016%
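
To give the density figure a sense of scale, here is a rough sketch of the arithmetic. It assumes the 0.016% is the fraction of tokens in the dashboard prompt set (238,145 prompts of 512 tokens each) on which the feature fires. The page does not say how the density was computed, so treat the result as an order-of-magnitude estimate. The function and constant names are illustrative only.

```python
# Figures taken from the dashboard configuration above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.016  # "Activations Density" as shown on the page


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of token positions where the feature is active.

    Assumption: density is measured over exactly this prompt set.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under that assumption the feature would be active on roughly twenty thousand of about 122 million token positions, which is consistent with a sparse feature.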