INDEX

Explanations

confusing things and confusion

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token (quoted to preserve leading spaces)    Logit
"檢"                                          0.42
" Constraints"                               0.41
" തീ"                                         0.40
" `>=`,"                                     0.37
" tật"                                       0.37
"ッケ"                                        0.37
"檄"                                          0.36
"可行"                                        0.36
" sappiamo"                                  0.36
" экстре"                                    0.35

Positive Logits

Token (quoted to preserve leading spaces)    Logit
" confusion"                                 3.80
" confused"                                  3.39
" confuse"                                   3.39
"Confusion"                                  3.39
" confusing"                                 3.38
" Confusion"                                 3.38
"confusion"                                  3.31
" confusión"                                 3.25
" confuses"                                  3.17
" confus"                                    3.11

Activations Density 0.151%
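
To give a sense of scale for the density figure, the sketch below works out roughly how many tokens it corresponds to. It rests on two assumptions that this page does not state: that activation density means the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the full dashboard prompt set listed under Configuration. The function names are illustrative only and do not belong to any library.

```python
# Figures copied from the Configuration and Activations Density entries.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.151


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(n_tokens: int, density_percent: float) -> float:
    """Approximate token count on which the feature fires.

    Assumes density is the fraction of tokens with a nonzero activation,
    measured over this same prompt set (an assumption, not stated on the page).
    """
    return n_tokens * density_percent / 100.0


n_tokens = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
n_active = expected_active_tokens(n_tokens, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {n_tokens:,}")
print(f"approx. active tokens: {n_active:,.0f}")
```

Under those assumptions the feature would fire on roughly 184 thousand of about 122 million tokens, which fits a feature keyed to a narrow word family such as "confusion" and its variants.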