Explanations

ethical actions, karma, exploitation, imperialism

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token             Logit
" Fear"           0.40
"Fear"            0.39
" uncontrollable" 0.38
" చా"             0.38
" Expenditure"    0.38
" sagging"        0.38
"囱"              0.38
" Poverty"        0.38
" Advertising"    0.37
"Advertising"     0.37

Positive Logits

Token             Logit
" कित"            0.42
" wronged"        0.42
" ethically"      0.41
" crimes"         0.40
" ethical"        0.40
"deeds"           0.39
" ethics"         0.39
" restitution"    0.39
" wrongs"         0.39
"鲇"              0.39

Activations Density 0.049%