INDEX

Explanations

`should be` or `need to`

The neuron fires most strongly on modal and emphatic words used in instructions—like “will,” “always,” “best,” “important,” and “is”—marking strong recommendations or certainty in advice.

New Auto-Interp

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Embeds

IFrame

Link

Not in Any Lists

Negative Logits

și

0.97

 vrijed

0.86

 endeavour

0.83

 Quindi

0.82

 endeavours

0.82

 znač

0.80

 gdyż

0.80

 więc

0.80

 fiche

0.79

 își

0.79

POSITIVE LOGITS

 meisten

0.73

 stuffs

0.73

 pernah

0.70

tumor

0.66

flavored

0.66

 tumor

0.65

 लहान

0.65

 banyak

0.64

 memang

0.63

 autoimmune

0.63

Activations Density 0.105%