INDEX

Explanations

better to overestimate than underestimate

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

aklar  0.42
鐒  0.40
 solitons  0.40
 Announcement  0.39
 erbjuder  0.38
URUK  0.38
 kär  0.38
 democratic  0.37
 stripes  0.37
usk  0.37

Positive Logits

 মন্ত্রণাল  0.48
ة  0.48
 показатель  0.45
었다  0.45
Чтобы  0.45
ية  0.44
،  0.44
၊  0.44
ндо  0.43
мин  0.43

Activations Density 0.005%
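
For a sense of scale, the density figure can be combined with the prompt configuration listed above. The sketch below assumes that "Activations Density" means the fraction of dashboard tokens on which the feature fires, and that every prompt contributes exactly 512 tokens; neither assumption is confirmed by the page, and the variable names are illustrative only.

```python
# Rough estimate of how many dashboard tokens activate this feature.
# Assumption (not stated on the page): density = activating tokens / total tokens.

NUM_PROMPTS = 238_145        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 512      # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.005      # from "Activations Density"

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {expected_active:,.0f}")
```

Under these assumptions the feature would fire on roughly six thousand of about 122 million tokens, which fits a very sparse feature.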