Explanations

kindness, support, and values

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

" palpable"    0.65
" dynamic"     0.65
" edge"        0.61
" young"       0.60
" condition"   0.60
" collar"      0.59
" tack"        0.59
" anchor"      0.59
" seal"        0.59
"ও"            0.59   (Bengali: "and")

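Positive Logits

"Justice"      0.75
"Service"      0.74
" Serving"     0.74
" Loot"        0.71
"贡献"         0.70   (Chinese: "contribution")
"Kind"         0.70
"幫助"         0.69   (Chinese: "help")
"Serving"      0.69
"善良"         0.69   (Chinese: "kind-hearted")
"服務"         0.68   (Chinese: "service")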

Activations Density 0.232%
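
For a sense of scale, the figures above can be combined with a little arithmetic. The sketch below is only a rough estimate: it assumes the activation density is the share of token positions in the dashboard prompts on which the feature fires, which this page does not state. The function names are hypothetical and exist only for illustration.

```python
# Figures copied from the dashboard text above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.232


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Rough count of token positions where the feature is active.

    Assumption: density is measured over these same token positions.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total token positions: {total:,}")
print(f"estimated active positions: {active:,}")
```

Under that assumption the feature would be active on roughly a quarter of a million of the approximately 122 million token positions, which is consistent with a sparse feature.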