Explanations

moral and ethical considerations

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" playfully"  0.44
" hostname"  0.43
" استقبال"  0.42
" পরিকল্প"  0.40
"憶"  0.40
" dispersion"  0.40
" carbure"  0.40
"を楽し"  0.40
"]+\"  0.39
" playful"  0.39

Positive Logits

" moral"  2.59
" ethical"  2.47
" Moral"  2.47
"道德"  2.45
" ethics"  2.42
" morality"  2.38
"Moral"  2.38
" morally"  2.36
" Ethical"  2.36
"moral"  2.34

Activations Density 0.473%
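
To put the density figure in context, the short sketch below converts it into an approximate token count. It assumes that the activation density is the fraction of all dashboard tokens on which the feature fires, and that the dashboard covers exactly the 392,802 prompts of 256 tokens listed under Configuration; neither assumption is confirmed by the page itself, and the function names are purely illustrative.

```python
# Figures copied from the dashboard; everything else is an assumption.
NUM_PROMPTS = 392_802        # prompts used for the dashboard
TOKENS_PER_PROMPT = 256      # tokens per prompt
DENSITY_PERCENT = 0.473      # reported activation density, in percent


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens the dashboard was computed over."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimate how many tokens activate the feature.

    Assumption: density = (tokens with nonzero activation) / (all tokens).
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,.0f}")
```

Under these assumptions the feature fires on roughly one token in every two hundred, which fits a feature tied to a narrow topic such as moral and ethical language.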