Explanations

giving instructions or commands

The neuron fires on content-rich words, especially nouns, verbs, names, and role titles, i.e. tokens that carry strong semantic importance rather than function words.

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
"hours"        -0.98
"perrt"        -0.98
"juje"         -0.97
"straint"      -0.87
"hrs"          -0.86
"Hours"        -0.85
"loride"       -0.83
" jour"        -0.83
" Inbox"       -0.83
"Ouverture"    -0.82

Positive Logits

Token               Logit
" instructions"     1.99
" commands"         1.79
" instruction"      1.77
" instructing"      1.71
" coaching"         1.61
" verbal"           1.60
" directing"        1.60
" communication"    1.54
" verbally"         1.48
" instruct"         1.46

Activations Density 0.024%
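
To give a feel for how sparse a density of 0.024% is, the short sketch below converts it into an approximate token count using the prompt configuration listed above (24,576 prompts of 128 tokens each). It rests on two assumptions that the page does not confirm: that the density is the fraction of tokens on which the neuron has a non-zero activation, and that it was measured over exactly this prompt set. The function names are illustrative only.

```python
# Figures quoted on the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.024  # "Activations Density 0.024%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate number of tokens on which the neuron is active.

    Assumption (not stated on the page): density is the fraction of
    tokens with a non-zero activation, measured over these prompts.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens:         {total:,}")
print(f"approx. active tokens: {round(active):,}")
```

Under those assumptions, the neuron would be active on roughly 755 of about 3.1 million tokens.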