Explanations

simple vs complex outcome

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" частично"    0.42
"الله"    0.40
" हद"    0.39
" nejen"    0.38
"毫无"    0.37
"不仅"    0.37
"ahoo"    0.36
"无疑"    0.36
"Ꮭ"    0.36
"OGRAM"    0.36

Positive Logits

" pourtant"    0.91
" nevertheless"    0.88
" nonetheless"    0.79
" trotzdem"    0.79
" disproportion"    0.78
" dennoch"    0.76
" impactful"    0.69
" Nonetheless"    0.68
"Nevertheless"    0.66
" Nevertheless"    0.65

Activations Density: 0.019%