INDEX

Explanations

The word "would" followed by a verb

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

" geeft"        0.53
" glad"         0.46
(token text missing)  0.45
" люблю"        0.43
" myself"       0.42
" mulig"        0.42
" sometimes"    0.42
"侮"            0.42
" Glad"         0.41

Positive Logits

" customarily"       0.53
" benötigt"          0.50
" alcançar"          0.50
" Presumably"        0.47
" nécessairement"    0.47
" necessarily"       0.47
"有着"               0.47
" necessariamente"   0.46
" presumably"        0.46
" понадоби"          0.46

Activations Density 0.003%
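
To put the density figure in context, the sketch below combines it with the prompt configuration listed above (238,145 prompts of 512 tokens each). It assumes, without confirmation from this page, that the density is the fraction of dashboard tokens on which the feature has a nonzero activation; the function names are illustrative only, and since 0.003% is a rounded value the result is an order-of-magnitude estimate.

```python
# Rough estimate of how many tokens the feature fires on.
# ASSUMPTION: "Activations Density" = active tokens / total dashboard tokens.

NUM_PROMPTS = 238_145        # from the Prompts (Dashboard) configuration
TOKENS_PER_PROMPT = 512      # from the Prompts (Dashboard) configuration
DENSITY_PERCENT = 0.003      # from the Activations Density line (rounded)


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Token positions implied by a density given as a percentage."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"approx. active tokens: {active:,.0f}")
```

Under that assumption the feature would fire on only a few thousand of roughly 122 million token positions, which is consistent with a narrow, sparse feature.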