INDEX

Explanations

humanity and individuality

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token  Logit
" সুরে"  0.40
"မ့်"  0.40
"фр"  0.39
" OutputStream"  0.39
" навіть"  0.38
"藜"  0.38
"校"  0.37
" Purpose"  0.37
"踮"  0.37
"Lie"  0.36

Positive Logits

Token  Logit
"human"  0.87
" flesh"  0.79
" human"  0.76
" челове"  0.76
" humanity"  0.74
" manusia"  0.73
" humanidad"  0.72
" മനുഷ്യ"  0.72
"individual"  0.70
"人間"  0.69

Activations Density 0.019%
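
To give the density figure a sense of scale, the sketch below estimates how many token positions the feature would fire on. It rests on two assumptions that the page does not confirm: that the density is the fraction of token positions with a nonzero activation, and that it was measured over the same 238,145 prompts of 512 tokens listed under Configuration. All variable names are illustrative.

```python
# Figures copied from the dashboard.
prompts = 238_145
tokens_per_prompt = 512
density_percent = 0.019

# Total token positions in the dashboard prompt set.
total_tokens = prompts * tokens_per_prompt

# Assumption: density = share of token positions where the feature is active,
# measured over this same prompt set.
estimated_active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,}")
```

If both assumptions hold, the feature fires on roughly 23 thousand of about 122 million token positions, which matches the narrow set of "human"-related tokens it promotes.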