Explanations

first historical facts

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token            Logit
" protects"      0.46
"썼"             0.44
"됐"             0.43
" Protect"       0.42
" melindungi"    0.42
" concisely"     0.42
"如何"           0.41
" exposes"       0.41
"🧂"             0.40
" mediates"      0.40

Positive Logits

Token            Logit
" perust"        0.52
" stairs"        0.50
" Exhibition"    0.48
" trunk"         0.45
" leisurely"     0.44
"JV"             0.43
" trunks"        0.42
"StartState"     0.42
"jib"            0.41
" जा"            0.40

Activations Density 0.004%