INDEX

Explanations

extreme sentiment/controversy

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token          Logit
 desplaz       0.38
📈             0.37
״              0.36
 Jupyter       0.35
 COMEN         0.35
 Specifically  0.35
 Matem         0.34
"\             0.33
 Tijdens       0.33
\|             0.33

Positive Logits

Token          Logit
 cutest        0.43
 hatred        0.42
 hilarious     0.41
 cute          0.41
 adorable      0.39
 murderous     0.38
Cute           0.37
然后再         0.36
funny          0.36
 bigotry       0.36
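
For readers unfamiliar with these lists, the sketch below shows one common way such top-logit tables are derived: the feature's decoder direction is projected through the model's unembedding matrix, and the largest and smallest entries are reported. This is an assumption about how the dashboard was produced, not something the page states. The function name, matrix shapes, and toy vocabulary are hypothetical and only illustrate the idea.

```python
import numpy as np

def top_logit_tokens(decoder_direction, unembedding, vocab, k=10):
    """Return the k most-promoted and k most-suppressed tokens for one feature.

    Assumed (hypothetical) shapes: decoder_direction is (d_model,),
    unembedding is (d_model, vocab_size).
    """
    # Effect of the feature direction on every vocabulary logit.
    logit_effects = decoder_direction @ unembedding
    order = np.argsort(logit_effects)  # ascending
    negative = [(vocab[i], float(logit_effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(logit_effects[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data standing in for a real model; none of these values come from the page.
rng = np.random.default_rng(0)
d_model, vocab_size = 8, 20
vocab = [f"tok{i}" for i in range(vocab_size)]
W_U = rng.normal(size=(d_model, vocab_size))
direction = rng.normal(size=d_model)

positive, negative = top_logit_tokens(direction, W_U, vocab, k=3)
```

With real weights, `positive` would correspond to a list like the one headed by " cutest" and `negative` to the one headed by " desplaz".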

Activations Density 0.000%
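
A density shown as 0.000% does not necessarily mean the feature never fires. The arithmetic below is a rough sketch under two assumptions that the page does not confirm: that density is the fraction of dashboard tokens on which the feature is active, and that the display rounds to three decimal places. The helper name `displayed_density` is hypothetical.

```python
PROMPTS = 238_145          # from the Configuration section
TOKENS_PER_PROMPT = 512    # from the Configuration section
total_tokens = PROMPTS * TOKENS_PER_PROMPT

def displayed_density(active_tokens, total):
    """Format a density the way the page appears to (assumed: 3 decimals)."""
    return f"{100.0 * active_tokens / total:.3f}%"

# Largest number of active tokens that would still display as 0.000%.
max_hidden = 0
while displayed_density(max_hidden + 1, total_tokens) == "0.000%":
    max_hidden += 1

print(total_tokens)   # 121930240
print(max_hidden)     # 609
```

Under those assumptions, the feature could be active on up to a few hundred of the roughly 122 million tokens and still be reported as 0.000%.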