Explanations

polite refusal or contrarian

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

取决于	0.49
加上	0.40
レベル	0.38
 attaches	0.38
 incorporates	0.37
 thighs	0.37
 fleurs	0.36
ܩ	0.36
 уровня	0.36
By	0.35

Positive Logits

 Instead	0.70
 disappointing	0.69
 대신	0.68
代わりに	0.68
 disappoint	0.66
 Sorry	0.66
 disappointed	0.64
 instead	0.63
 निराश	0.63
 sorry	0.60

Activations Density 0.883%