INDEX

Explanations

and Canada

references to suicide/crisis/self‑harm hotlines and emergency mental‑health support contact information.

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token               Logit
"立方"              0.39
"Valent"            0.39
"bandit"            0.38
"ヴィトン"          0.38
"パッド"            0.38
"丁"                0.37
" מק"               0.37
"瑯"                0.36
" longitudinally"   0.35
" гидро"            0.35

Positive Logits

Token      Logit
"Su"       0.37
"dech"     0.35
"кет"      0.35
"Su"       0.35
"kan"      0.33
" আশে"     0.33
"su"       0.33
"Kan"      0.33
"nica"     0.32
" IEEE"    0.32

Activations Density 0.011%