INDEX

Explanations

No Explanations Found

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token          Logit
"lisher"       1.06
"entation"     1.06
"scrib"        1.04
"блон"         0.97
"dler"         0.97
" cosecha"     0.96
"姻"           0.95
"ipada"        0.95
"筵"           0.93
" symplect"    0.90

Positive Logits

Token             Logit
" protections"    2.19
"Policies"        2.05
" protect"        1.98
" precautions"    1.97
" protection"     1.96
" safeguard"      1.95
" policies"       1.93
" initiatives"    1.93
"保障"            1.92
" concerns"       1.91

Activations Density 0.857%

No Known Activations
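
As a rough sanity check on scale, the configuration and the activation density can be combined to estimate how many token positions this feature fires on. The sketch below assumes the 0.857% density is measured over the same 238,145 prompts of 512 tokens listed under Configuration, which the page does not state. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard; everything else is an assumption.
NUM_PROMPTS = 238_145          # "238,145 prompts"
TOKENS_PER_PROMPT = 512        # "512 tokens each"
ACTIVATION_DENSITY = 0.857 / 100  # "Activations Density 0.857%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Estimated positions with a nonzero activation.

    Assumes the density was computed over the same `total` positions.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)

print(f"total token positions: {total:,}")
print(f"estimated active positions: {active:,}")
```

Under that assumption the feature would be active on roughly one million of about 122 million token positions, so it is sparse but far from rare.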