Explanations

absolutely cannot and will not

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"ূলে"  0.43
" identifiable"  0.41
"राध"  0.37
"の子"  0.37
" chants"  0.37
"녁"  0.37
"]}'"  0.36
" 기준"  0.36
" chatt"  0.36
"パク"  0.36

Positive Logits

" advisable"  0.75
" encouraged"  0.69
" توصیه"  0.64
" recommended"  0.61
"推奨"  0.60
" discouraged"  0.59
"probability"  0.56
"Recommended"  0.55
" рекомендуется"  0.55
" desirability"  0.55

Activations Density 0.227%