Explanations

start to, root to, flaws and

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"]{"        0.44
"મારા"      0.40
" ලෙස"      0.40
"foss"      0.39
" яким"     0.37
"൮"         0.37
"猁"        0.37
" Prendre"  0.36
" кантип"   0.36
" dimana"   0.36

Positive Logits

"AND"           0.47
" protector"    0.42
" waistcoat"    0.42
" maupun"       0.41
" warts"        0.41
"Ĺ"             0.38
" aforesaid"    0.38
"foe"           0.38
" względu"      0.37
" countryside"  0.37

Activations Density 0.030%
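
To give a sense of scale, the density figure can be combined with the prompt counts under Configuration. The sketch below is only a rough estimate. It assumes that density means the fraction of token positions with a nonzero activation, and that it was measured over exactly the dashboard prompts listed above. The page confirms neither assumption, and the variable names are illustrative.

```python
# Rough estimate of how many token positions activate this feature.
# Assumption (not confirmed by the page): density is the fraction of token
# positions with nonzero activation, measured over the dashboard prompts.
num_prompts = 238_145        # from "Prompts (Dashboard)"
tokens_per_prompt = 512      # from "Prompts (Dashboard)"
density = 0.030 / 100        # "Activations Density 0.030%" as a fraction

total_tokens = num_prompts * tokens_per_prompt
approx_active = round(total_tokens * density)

print(f"total token positions: {total_tokens:,}")
print(f"approx. active positions: {approx_active:,}")
```

Under these assumptions the feature fires on only a few tens of thousands of positions out of roughly 122 million, which fits a sparse feature.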