INDEX

Explanations

positive attitude and enthusiasm

The neuron activates on words that express personal agency, attitudes, or character traits (e.g. freedom, try, can, diligent, jealous, halfheartedly).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token            Logit
" albi"          0.80
" reliefs"       0.80
" procès"        0.79
" albums"        0.75
"anço"           0.75
" pointillés"    0.75
"تبقى"           0.74
" salon"         0.72
"下载"           0.72
" Sculpture"     0.71

Positive Logits

Token               Logit
" willingness"      1.47
" selfless"         1.37
" Willing"          1.27
"willing"           1.25
" willing"          1.23
" selfishness"      1.19
" eagerness"        1.16
"態度"              1.16
" unwillingness"    1.16
" attitude"         1.13

Activations Density 1.844%
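
To make the density figure concrete, the sketch below converts it into an approximate token count. It assumes (the page does not state this) that the 1.844% density is measured over the same 392,802 prompts of 256 tokens listed under Configuration, and that density means the share of tokens on which the neuron has a nonzero activation. The function names are hypothetical and used only for illustration.

```python
def total_tokens(n_prompts: int, tokens_per_prompt: int) -> int:
    """Number of tokens in the prompt set."""
    return n_prompts * tokens_per_prompt


def estimated_active_tokens(n_tokens: int, density_percent: float) -> int:
    """Approximate count of tokens with a nonzero activation.

    Assumption: density is the percentage of tokens on which the neuron fires.
    """
    return round(n_tokens * density_percent / 100)


# Figures taken from the Configuration and Activations Density entries.
n_tokens = total_tokens(392_802, 256)
n_active = estimated_active_tokens(n_tokens, 1.844)
print(f"{n_tokens:,} tokens in total, roughly {n_active:,} with a nonzero activation")
```

Under these assumptions the neuron would fire on roughly 1.85 million of about 100.6 million tokens.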