Explanations

dialogue starters and punctuation

Tokens at the beginning of the model's response, or at transitions between parts of the response: acknowledgments, formatting elements, role-play indicators, and the first tokens of the actual content. These tokens signal that the model is engaging with an unusual, creative, or instruction-following task rather than standard question answering.

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

0.29

deki

0.29

 twor

0.29

 darbo

0.29

 pristup

0.28

 gerenci

0.28

 workflow

0.27

重視

0.27

 instructive

0.27

Positive Logits

 Button

0.30

 Merriam

0.30

 Kimberly

0.30

ampton

0.29

 маленький

0.28

 Bethany

0.27

 неожидан

0.27

ână

0.27

 Australian

0.27

改变

0.27

Activations Density 0.700%
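
As a rough illustration of what this density means at the scale of the dashboard's prompt set, the sketch below converts the percentage into an approximate token count. It assumes that "Activations Density" is the fraction of all dashboard tokens on which the feature has a non-zero activation, and that every prompt contributes exactly 512 tokens; neither assumption is confirmed by the page itself, and the function name is hypothetical.

```python
# Figures copied from the dashboard above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.700


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens where the feature fires).

    Assumption: density is the share of all tokens with non-zero activation.
    """
    total = num_prompts * tokens_per_prompt
    active = total * density_percent / 100.0
    return total, active


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {active_tokens:,.0f}")
```

Under these assumptions the feature fires on roughly 0.85 million of about 122 million tokens, which is consistent with a feature tied to a narrow set of response-structure positions rather than to general text.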