INDEX

Explanations

potential consequences and benefits

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

𝐀    1.84
𝐤    1.72
िक    1.71
ഗീയ    1.66
𝐢    1.61
너    1.56
⠀⠀⠀⠀    1.54
 treads    1.51
ण    1.51
𝐓    1.50

Positive Logits

nameof    1.57
 trò    1.43
 hogy    1.41
 seorang    1.41
 indica    1.41
সব    1.37
 overwhelming    1.37
 accro    1.36
 придется    1.36
 खतरनाक    1.36

Activations Density 0.130%