INDEX

Explanations

past tense verbs and negations

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

icelli    0.68
 الأخرى    0.59
ână    0.56
ソコン    0.55
ន្ថ    0.54
<unused1839>    0.53
 بالإضافة    0.53
 Daarna    0.51
ことも    0.51
!).    0.51

Positive Logits

 didnt    0.74
 doesnt    0.68
 didn    0.67
 เคย    0.67
saw    0.66
 видел    0.65
 seems    0.64
didn    0.63
 drove    0.62
俺    0.62

Activations Density: 0.004%