Explanations

restrictive or complex features

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token                        Logit
" GATE"                      0.42
" gateway"                   0.42
" করিতেছি"                    0.41
" Kitchen"                   0.41
" transcribe"                0.41
"碳"                          0.41
"Kai"                        0.40
[token missing in source]    0.39
"↵"                          0.38
" Fetch"                     0.38

Positive Logits

Token             Logit
" overruling"     0.52
"backs"           0.50
"ierungs"         0.47
"enemy"           0.46
" aroused"        0.46
"បង្"              0.46
" overruled"      0.44
" angered"        0.44
" اخت"            0.44
" دیگر"           0.44

Activations Density 0.001%
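
As a rough sanity check on the density figure, the sketch below works out how many tokens the dashboard run covers and roughly how many of them the feature would fire on. It assumes that activation density means the fraction of all dashboard tokens with a nonzero activation, and that every prompt contributes exactly 512 tokens. The function and constant names are illustrative and do not come from the dashboard. Because the displayed 0.001% is presumably rounded, the result is only an order-of-magnitude estimate.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.001  # displayed value, presumably rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard run."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimate how many tokens activate the feature.

    Assumption: density = (tokens with nonzero activation) / (all tokens).
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {round(active):,}")
```

Under these assumptions the feature fires on roughly a thousand tokens out of more than a hundred million, which is consistent with it being very sparse.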