INDEX

Explanations

thoughts related to harmful or sexual content

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" Seesaw"  1.41
" axiom"  1.38
"েও"  1.35
"↵"  1.32
" Yoruba"  1.28
" metaphors"  1.25
"be"  1.23
" Congolese"  1.23
" বটে"  1.21
" của"  1.20

Positive Logits

"ول"  2.52
"ل"  2.36
"ある"  1.87
"نت"  1.73
"ર"  1.72
"皞"  1.71
"б"  1.69
"iz"  1.64
"وب"  1.64
"ция"  1.64

Activations Density 0.013%