Explanations

number representation and levels

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" niezwy"  0.73
"Ab"  0.68
" három"  0.68
"department"  0.66
" veoma"  0.66
"special"  0.66
"another"  0.65
"ামূলক"  0.64
"兩"  0.64
"avatth"  0.64

Positive Logits

" connotation"  0.89
" versions"  0.87
"😑"  0.85
" بودن"  0.83
" connotations"  0.83
" progression"  0.82
" depiction"  0.82
" เพราะ"  0.81
" 버전"  0.80
" unless"  0.80

Activations Density 0.176%
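
To give a sense of scale for the density figure, the sketch below converts it into an approximate token count. It assumes (this is not stated on the page) that the activation density is the fraction of dashboard tokens on which the feature has a nonzero activation, and that it was measured over the same 238,145 prompts of 512 tokens each. The variable names are illustrative only.

```python
# Figures copied from the dashboard fields above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.176

# Assumption: density = (tokens with nonzero activation) / (all dashboard tokens).
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * ACTIVATION_DENSITY_PERCENT / 100)

print(f"total dashboard tokens: {total_tokens:,}")
print(f"approx. tokens where the feature fires: {approx_active_tokens:,}")
```

Under that assumption, the feature would fire on roughly one token in every 570, which is consistent with a sparse feature.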