Explanations

be + adjective or being + adjective

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

เป็น        0.44
 isValid    0.40
을          0.38
 caches     0.38
(`          0.38
最          0.37
を使用      0.36
是一种      0.36
}-          0.36
 patches    0.35

Positive Logits

 capazes    0.57
 able       0.53
 aware      0.49
 قادر       0.47
 tempted    0.47
 unable     0.46
 capaces    0.46
 willing    0.46
 atentos    0.44
 wary       0.43

Activations Density 0.191%
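
To give the density figure a sense of scale, here is a rough sketch that converts it into an approximate token count. It assumes density is the fraction of all dashboard tokens on which the feature activates, and that every prompt contributes exactly 512 tokens. Both are assumptions, not something the page states. The variable names are illustrative only.

```python
# Figures copied from the Configuration and Activations Density entries.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.191

# Assumption: density = activating tokens / total tokens in the dashboard set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active_tokens:,.0f}")
```

Under those assumptions the feature fires on roughly 233 thousand of about 122 million tokens, which fits a sparse feature tied to a narrow construction such as "be + adjective".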