INDEX

Explanations

implementing active concepts

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token            Logit
"メージ"          0.43
" kest"          0.39
"rous"           0.39
"ڃ"              0.39
" distin"        0.38
" ngunit"        0.38
" Cependant"     0.38
" เอ่อ"           0.38
" fejl"          0.37
" neither"       0.37

Positive Logits

Token              Logit
" 적극"             0.43
"荠"               0.43
" использовать"    0.42
"యిత"              0.42
" মুহাম্ম"           0.42
" Zentrum"         0.41
"ötet"             0.41
" implementar"     0.40
" Methodology"     0.40
" активно"         0.39

Activations Density 0.022%
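
To give the density figure a sense of scale, the arithmetic can be sketched as below. This is only an illustration under a stated assumption: that the 0.022% density is measured over the same dashboard prompt set listed under Configuration (238,145 prompts of 512 tokens each). The page does not confirm this, and the function names are hypothetical.

```python
# Rough scale of an activation density figure.
# ASSUMPTION: density is measured over the dashboard prompt set
# (238,145 prompts x 512 tokens); the source page does not state this.

def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(density_percent: float, num_tokens: int) -> float:
    """Token positions with non-zero activation, given a density in percent."""
    return density_percent / 100.0 * num_tokens


NUM_PROMPTS = 238_145        # from the Configuration section
TOKENS_PER_PROMPT = 512      # from the Configuration section
DENSITY_PERCENT = 0.022      # from the Activations Density line

n_tokens = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
n_active = expected_active_tokens(DENSITY_PERCENT, n_tokens)

print(f"{n_tokens:,} token positions, roughly {n_active:,.0f} active")
```

Under that assumption, the feature would fire on roughly 27 thousand of about 122 million token positions, which is consistent with a sparse, narrowly scoped feature.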