INDEX

Explanations

hazard identification and assessment

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

2.11
"й"             2.11
"ている"        2.08
"к"             2.00
"де"            1.98
" ante"         1.98
"р"             1.91
" unbreakable"  1.88
"मध्ये"          1.87

Positive Logits

"zne"       2.00
" dolayı"   1.95
"ным"       1.84
" Большая"  1.80
"},"        1.78
"ously"     1.77
"unuz"      1.77
"لية"       1.73
"}$,"       1.70

Activations Density 0.003%