Explanations

toxic and harmful substances

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token          Logit
 수요           0.38
 логи          0.37
 একাডেম         0.35
 ভার্চ           0.33
业务            0.33
 SharePoint    0.33
롤              0.33
鰱              0.32
버              0.32
宾              0.32

Positive Logits

Token            Logit
 toxins          0.92
 toxic           0.85
 poisonous       0.84
 toxicity        0.80
 contaminants    0.78
 токси           0.76
 poisons         0.74
 carcinogenic    0.74
 carcin          0.71
toxic            0.69

Activations Density 0.184%
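
A figure like the activation density above is commonly read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading on toy data. It assumes "fires" means an activation strictly greater than zero; the function name `activation_density`, the `threshold` parameter, and the toy values are hypothetical, and this is not necessarily how the dashboard computes its number.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 1,000 token positions, of which 2 activate the feature.
toy = [0.0] * 998 + [1.7, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.184% would mean the feature is active on roughly 1 in every 543 token positions.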