INDEX

Explanations

intentionality and awareness of actions

The neuron responds to words that signal deliberate or intentional actions (e.g. “intentionally,” “deliberate,” “intentional misconduct”).

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" আপাতত"       0.81
" रहेंगे"        0.76
" বর্তমানে"     0.75
" Hopefully"   0.75
" yapılacak"   0.73
" تړل"         0.73
" நடைபெறும்"    0.72
"これから"      0.72
" Currently"   0.70
" जाणार"       0.70

Positive Logits

" consciously"     1.15
" knowingly"       1.02
" decisions"       1.01
"consciously"      1.00
" intentionally"   0.97
" unwitting"       0.96
" deliberately"    0.95
"did"              0.94
" bewust"          0.93
" decisiones"      0.91

Activations Density: 0.247%
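
The density figure can be read as the share of token positions on which the neuron fires. The sketch below is illustrative only: it assumes density means "fraction of tokens with activation above zero" and that it is measured over the prompt set listed under Configuration, neither of which the dashboard states. The function name activation_density, the threshold parameter, and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token activation
    values. The zero threshold is an assumption, not a documented setting.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Toy data (made up): two prompts of four tokens, one active token.
toy = [
    [0.0, 0.0, 1.3, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
density = activation_density(toy)
print(f"{density:.3%}")

# Scale implied by the dashboard figures, under the assumptions above.
total_tokens = 392_802 * 256                      # prompts x tokens per prompt
approx_active = round(total_tokens * 0.00247)     # 0.247% of all tokens
print(total_tokens, approx_active)
```

Under those assumptions, 0.247% of roughly 100 million tokens corresponds to about a quarter of a million active positions, which fits a neuron that responds only to a narrow class of words.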