Explanations

refusal to generate explicit content

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token          Logit
ilerini        0.39
ፓ              0.37
 телефон       0.36
TorpedoStore   0.35
harga          0.34
cargar         0.34
rollees        0.34
θή             0.34
 เหตุ          0.34
 प्रका         0.34

Positive Logits

Token          Logit
$=\            0.39
 unpredict     0.39
$+\            0.39
 persistently  0.38
$[\            0.38
 dating        0.37
 vividly       0.37
AR             0.36
$\             0.36
 Dating        0.36

Activations Density 0.002%
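
To give the density figure a sense of scale, the sketch below multiplies out the numbers listed above. It rests on two assumptions that the listing itself does not confirm: that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the same 238,145 prompts of 512 tokens each. The constant and function names are illustrative only, and because the displayed percentage is rounded, the resulting count is approximate.

```python
# Figures copied from the dashboard listing.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.002  # "Activations Density 0.002%" (rounded as displayed)


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all prompts."""
    return num_prompts * tokens_per_prompt


def implied_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of tokens with a nonzero activation.

    Assumption: density is the fraction of tokens on which the feature fires,
    expressed as a percentage.
    """
    return round(total * density_percent / 100)


TOTAL = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
ACTIVE = implied_active_tokens(TOTAL, DENSITY_PERCENT)

if __name__ == "__main__":
    print(f"total tokens: {TOTAL:,}")
    print(f"implied active tokens: ~{ACTIVE:,}")
```

Under those assumptions, the feature would fire on roughly 2,400 of about 122 million tokens, which fits a sparse feature tied to a narrow behaviour.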