Explanations

scandals and abuses

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

 കർ    0.47
 unintentional    0.45
 अशुभ    0.44
 неуда    0.44
 involuntarily    0.43
 البيت    0.42
 unintentionally    0.42
 meetup    0.42
 unrecognized    0.41
邑    0.41

Positive Logits

 disgraceful    0.79
 outrageous    0.65
 ridiculous    0.64
 appalling    0.63
 unbelievable    0.62
 ludicrous    0.62
 astounding    0.61
 disgusting    0.59
 scandalous    0.59
 disgrace    0.57

Activations Density 0.011%