
Explanations

positive affirmations and praise

The neuron activates strongly on enthusiastic praise and positive-sentiment expressions (e.g., “beautiful,” “thank you,” “!”).


Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

"并非"  0.79
"并不是"  0.67
"如果我们"  0.67
"没办法"  0.65
" якщо"  0.65
" wrongfully"  0.64
" אם"  0.63
"だから"  0.63
"냐면"  0.63
" falsely"  0.61

Positive Logits

" Congratulations"  1.05
" congratulations"  1.03
" congrats"  1.03
" Congrats"  1.02
"Congratulations"  0.98
" admirable"  0.96
" impressive"  0.95
"Congrats"  0.91
" beeindruck"  0.90
" congrat"  0.88

Activations Density 0.935%
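
To give the density figure a sense of scale, the sketch below combines it with the prompt counts listed under Configuration. It assumes, and the page does not confirm this, that the activation density is the fraction of dashboard tokens on which the neuron has a nonzero activation, and that it was measured over the same 392,802 prompts. The function names are hypothetical and used only for illustration.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.935 / 100  # 0.935% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Approximate count of tokens with nonzero activation.

    Assumption: density = active tokens / total tokens, measured
    over the same prompt set. The dashboard does not state this.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under these assumptions the prompt set holds roughly 100.6 million tokens, so a density below 1% still corresponds to several hundred thousand activating tokens.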