Explanations

legal restrictions

Refusal and safety-disclaimer passages that explain why a request cannot be fulfilled, often formatted with bold headings, parentheses or slashes, and bullet lists.

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

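Negative Logits

Token (quoted to show leading spaces), then logit value:

"の高い"  0.44  (Japanese fragment: "with high ...")
" visualization"  0.43
" visualized"  0.42
" fadeInLeft"  0.42
" debugging"  0.41
" wybrać"  0.41  (Polish: "to choose")
" Hamming"  0.40
" благоприят"  0.39  (Russian word fragment: "favorabl-")
"を楽し"  0.39  (Japanese fragment: "enjoy")
" simulation"  0.39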

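Positive Logits

Token (quoted to show leading spaces), then logit value:

" legally"  1.52
" legal"  1.34
" legality"  1.30
" कानूनी"  1.27  (Hindi: "legal")
"legal"  1.26
" قانونی"  1.22  (Persian/Urdu: "legal")
"Legal"  1.16
"法律"  1.15  (Chinese/Japanese: "law")
" legales"  1.14  (Spanish: "legal", plural)
" Legal"  1.13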
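
The page does not say how the logit lists are produced. One common way to build lists like these, offered here only as an assumption, is to take the dot product of a feature's output direction with each token's unembedding vector and then rank the tokens. The sketch below uses a made-up 3-dimensional model and a 4-token vocabulary; `logit_effects`, `top_and_bottom`, and all numbers are hypothetical and are not taken from the dashboard.

```python
# Toy sketch: rank vocabulary tokens by how much a feature direction
# raises or lowers their logits. All values here are invented.

def logit_effects(direction, unembed):
    """Dot the feature direction with each token's unembedding vector."""
    return {tok: sum(d * w for d, w in zip(direction, vec))
            for tok, vec in unembed.items()}

def top_and_bottom(effects, k):
    """Return (k most positive, k most negative) tokens with their scores."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]

# Hypothetical feature direction and unembedding vectors.
direction = [1.0, 0.5, 0.0]
unembed = {
    " legally": [1.0, 1.0, 0.0],
    " legal": [1.0, 0.5, 0.0],
    " debugging": [-0.5, 0.25, 1.0],
    " simulation": [-0.25, -0.5, 0.0],
}

effects = logit_effects(direction, unembed)
positive, negative = top_and_bottom(effects, k=2)
print(positive)  # tokens the feature pushes up the most
print(negative)  # tokens the feature pushes down the most
```

In this toy setup the tokens most aligned with the direction land in the positive list and the least aligned land in the negative list, which mirrors the two-list layout above.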

Activations Density 0.235%
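
To put the density figure in perspective, here is a rough back-of-the-envelope calculation. It assumes that density means the fraction of tokens on which the feature has a nonzero activation, and that it was measured on the dashboard prompt set listed under Configuration; the page confirms neither, so treat the result as an order-of-magnitude estimate.

```python
# Figures quoted on the dashboard.
num_prompts = 238_145
tokens_per_prompt = 512
density_percent = 0.235

total_tokens = num_prompts * tokens_per_prompt

# Assumption: density = activating tokens / total tokens in this prompt set.
expected_active = total_tokens * density_percent / 100

print(f"{total_tokens:,} tokens in total")
print(f"about {expected_active:,.0f} tokens with a nonzero activation")
```

Under these assumptions the feature would fire on roughly 287 thousand of about 122 million tokens.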