INDEX

Explanations

PR Review, Python, Attention Mechanisms

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

 약간

0.46

 அழகான

0.39

 ಸ್ವಲ್ಪ

0.37

 слегка

0.37

pubescens

0.37

<0x15>

0.36

 गिरफ्तार

0.36

 있습니다

0.35

আত

0.35

Pyrid

0.35

Positive Logits

0.53

 particolare

0.52

 khususnya

0.50

 specifically

0.49

 particular

0.44

 notorious

0.44

 particularly

0.44

 speziell

0.42

izar

0.41

Activations Density 0.078%