Explanations

describing humans or personal characteristics

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"几何"  0.35
"可视化"  0.33
"物理"  0.32
" системы"  0.32
"ToolStrip"  0.31
" инфраструк"  0.31
"散热"  0.30
" cinematographer"  0.30
"ୋ"  0.30
" উপকূল"  0.30

Positive Logits

" Biographical"  0.37
" 인간"  0.37
" humain"  0.37
" humanas"  0.36
" personal"  0.36
" humana"  0.35
" mammalian"  0.35
" Personnel"  0.33
" Personal"  0.33
" نمی"  0.32

Activations Density 0.499%
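
To put the density figure in context, the arithmetic below is a rough sketch that combines it with the prompt counts listed under Configuration. It assumes, without confirmation from this page, that the density is the fraction of individual tokens on which the feature fires and that it was measured over the same 238,145 dashboard prompts. The function names are hypothetical and chosen only for illustration.

```python
# Figures copied from the dashboard; everything else is an illustrative assumption.
NUM_PROMPTS = 238_145               # "238,145 prompts"
TOKENS_PER_PROMPT = 512             # "512 tokens each"
ACTIVATION_DENSITY_PERCENT = 0.499  # "Activations Density 0.499%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of token positions where the feature is non-zero.

    Assumes the density is a per-token fraction expressed as a percentage.
    """
    return round(total * density_percent / 100)


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
    print(f"total token positions: {total:,}")
    print(f"estimated active positions: {active:,}")
```

Under those assumptions the feature would fire on roughly one token position in 200, which is sparse and consistent with a feature that responds to a specific theme.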