Explanations

saying no appropriately

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"ו"          0.40
"ਛ"          0.39
" சிறிது"     0.39
"paar"       0.39
"стта"       0.38
"ամ"         0.38
"ुअल"        0.38
"аст"        0.38
"츰"         0.38
"ፃ"          0.37

Positive Logits

" appropriately"    0.75
" correctly"        0.69
" appropriate"      0.68
" সঠিকভাবে"         0.61
" properly"         0.61
" megfelelő"        0.61
" correctamente"    0.59
"適切な"             0.56
" مناسب"            0.55
" corretamente"     0.54

Activations Density 0.074%
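
To give the density figure a sense of scale, here is a rough arithmetic sketch. It assumes the density is the fraction of token positions on which the feature has a nonzero activation. It also assumes the density was measured over the prompt set listed under Configuration. The page states neither, so treat the result as an order-of-magnitude estimate, not a reported count. The function and constant names are illustrative only.

```python
# Assumption: density = share of token positions with nonzero activation,
# measured over the prompt set listed under Configuration.
NUM_PROMPTS = 238_145        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 512      # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.074      # from "Activations Density"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate number of positions where the feature fires."""
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"total token positions: {total:,}")
print(f"estimated active positions: {active:,}")
```

Under these assumptions the feature would fire on roughly 90,000 of about 122 million token positions.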