Explanations

child exploitation and safety

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token          Logit
" Senn"        1.06
" молодо"      0.98
" senior"      0.97
"ез"           0.97
" شاب"         0.92
" teenager"    0.91
"о"            0.90
" young"       0.90
"enig"         0.88
"eev"          0.84

Positive Logits

Token          Logit
"swear"        1.44
"们的"          1.17
" लेट्स"        1.17
"🧒"           1.04
" prodig"      0.96
"wunsch"       0.95
" playroom"    0.95
" emperor"     0.94
" rhymes"      0.94
"PLAYS"        0.94

Activations Density 0.114%
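
To put the configuration and density figures in context, the short sketch below works out how many tokens the dashboard prompts cover and roughly how many of them the feature would fire on. It assumes that the activation density is the share of token positions with a nonzero activation and that it was measured over this same prompt set; the page does not confirm either point. All variable names are illustrative.

```python
# Figures copied from the dashboard text.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.114

# Total token positions covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = fraction of token positions where the feature
# has a nonzero activation, measured over this same prompt set.
estimated_active_tokens = round(total_tokens * ACTIVATION_DENSITY_PERCENT / 100)

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,}")
```

Under those assumptions the feature is sparse: it would be active on about one token position in every 880.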