INDEX

Explanations

human expressions and reactions

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

 sürekli  0.38
趵  0.38
тров  0.37
 screaming  0.36
 continually  0.34
ለያ  0.34
 continuously  0.34
解决了  0.34
/*/  0.33
 अक्टूबर  0.33

Positive Logits

 smiled  1.01
 nodded  0.97
 chuckled  0.95
 replied  0.92
 laughed  0.91
 chuckle  0.91
 smile  0.90
 shrug  0.85
 reply  0.85
 nodding  0.85

Activations Density 0.027%