INDEX

Explanations

The names of concepts

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" Specifically"   0.39
"ى"               0.39
"ร่าง"            0.36
"😅"              0.35
"with"            0.35
"最近"            0.35
"只限平日"        0.35
"呈"              0.35
"With"            0.34
" برای"           0.34
Positive Logits

"odore"      0.54
"atrical"    0.52
" plight"    0.44
" odyssey"   0.42
" story"     0.41
"ка"         0.41
" ulcer"     0.39
" beggar"    0.39
"mselves"    0.39
" impact"    0.38
Activations Density 0.122%
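
To put the density figure in context, the short sketch below estimates how many tokens in the dashboard prompt set would activate this feature. It assumes that "Activations Density" is the fraction of all dashboard tokens with a nonzero activation, and that every one of the 238,145 prompts is exactly 512 tokens long. The dashboard does not confirm either assumption, and the function name is hypothetical.

```python
def estimate_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> tuple[int, float]:
    """Return (total tokens, estimated activating tokens).

    Assumption: density_percent is the share of all tokens on which
    the feature fires, expressed as a percentage.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Figures taken from the dashboard: 238,145 prompts, 512 tokens each, 0.122%.
total, active = estimate_active_tokens(238_145, 512, 0.122)
print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:,.0f}")
```

Under these assumptions the feature fires on roughly one token in every 820, which is consistent with a sparse feature.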