INDEX

Explanations

harmful, unethical, racist, sexist, toxic, dangerous, or illegal

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token      Logit
" ২০২২"    0.70
" २०२२"    0.66
"🫶"       0.64
"🥹"       0.60
"🫣"       0.60
"🫢"       0.56
"🫠"       0.54
" ২০২১"    0.53
"🥲"       0.52
"🪄"       0.51

Positive Logits

Token             Logit
" coronavirus"    1.30
" Coronavirus"    1.28
"Coronavirus"     1.24
"coronavirus"     1.19
" कोरोनावायरस"    0.97
" коронави"       0.95
" করোনাভাই"       0.93
" corona"         0.89
" Corona"         0.87
"corona"          0.82

Activations Density: 0.005%