INDEX

Explanations

sexual content, harmful, explicit

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"luž"  0.44
"superuser"  0.44
"ylene"  0.44
"ž"  0.44
" horrendous"  0.44
"лл"  0.43
"kHz"  0.42
"ែម"  0.42
"اءِ"  0.40
" sabbam"  0.39

Positive Logits

" conten"  0.49
"在意"  0.49
" qualifying"  0.48
"屐"  0.47
" contests"  0.45
" dislocations"  0.45
" workers"  0.44
" asymmetries"  0.44
" endorph"  0.44
" Οι"  0.43

Activations Density 0.007%