INDEX

Explanations

unethical or racist bot

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token               Logit
"τ"                 0.46
" божомолдору"      0.46
"糈"                0.46
" चल"               0.43
"к"                 0.42
"াস্থ্য"              0.41
" cục"              0.41
"RetResult"         0.41
"ϲ"                 0.40
"tau"               0.40

Positive Logits

Token               Logit
" principles"       0.52
" ants"             0.46
" decept"           0.46
" completely"       0.45
" passivation"      0.45
" guides"           0.44
" totalitarian"     0.44
" perfect"          0.44
" acceptance"       0.44
" anti"             0.44

Activations Density 0.015%
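
To give a rough sense of scale for the density figure, the sketch below multiplies out the dashboard numbers. It assumes that "Activations Density" means the fraction of token positions in the dashboard prompts on which the feature has a nonzero activation. The page does not state this definition, so treat the result as an estimate under that assumption. The function names are illustrative only and do not come from any dashboard API.

```python
# Figures copied from the dashboard above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.015


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated number of token positions where the feature fires.

    Assumption: density is the percentage of token positions with a
    nonzero activation. This definition is not confirmed by the page.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")                 # total tokens: 121,930,240
print(f"estimated active tokens: {active:,.0f}")  # estimated active tokens: 18,290
```

Under that reading, the feature would fire on roughly 18 thousand of about 122 million token positions.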