Explanations

following prepositions or articles

This neuron detects linguistic qualifiers: auxiliaries and modals (can, do), negations (not), and adverbs or connectives (always, mainly, due to, actually) that modify or hedge statements.

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token  Logit
'of'  -2.30
' from'  -1.58
' including'  -1.48
'You'  -1.42
' usually'  -1.30
' However'  -1.30
'According'  -1.27
' Ancak'  -1.26
'If'  -1.25
(token not shown)  -1.23

Positive Logits

Token  Logit
'("")]'  1.56
' eenig'  1.48
'ONLY'  1.48
'颦'  1.48
' eenige'  1.47
'穑'  1.47
'}));'  1.46
' butik'  1.45
'órica'  1.44
' tehd'  1.44
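
For readers unfamiliar with these two lists, here is a minimal sketch of one common way such rankings are produced. The neuron's output direction is dotted with each token's unembedding vector, and the tokens with the most negative and most positive results are listed. This is an assumption about the general technique, not a description of how this dashboard computes its values. The function names, token labels, and numbers below are hypothetical toy values, not this neuron's weights.

```python
def logit_effects(output_direction, unembedding):
    """Dot the neuron's output direction with each token's unembedding vector.

    Both arguments are hypothetical toy inputs: a list of floats and a
    dict mapping token strings to lists of floats of the same length.
    """
    return {
        token: sum(w * u for w, u in zip(output_direction, vector))
        for token, vector in unembedding.items()
    }


def top_and_bottom(effects, k):
    """Return the k most negative and the k most positive (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# Toy values for illustration only (not taken from the dashboard).
toy_direction = [1.0, -0.5]
toy_unembedding = {
    "tok_a": [1.0, -1.0],   # 1.0*1.0 + (-0.5)*(-1.0) = 1.5
    "tok_b": [-2.0, 0.5],   # 1.0*(-2.0) + (-0.5)*0.5 = -2.25
    "tok_c": [0.0, 0.0],    # 0.0
}

effects = logit_effects(toy_direction, toy_unembedding)
negative, positive = top_and_bottom(effects, k=1)
```

Under this reading, a token in the negative list is one whose predicted logit is pushed down when the neuron fires, and a token in the positive list is one whose logit is pushed up.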

Activations Density 0.727%
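
To put the density figure in context, the sketch below works through the arithmetic using the prompt count and prompt length listed under Configuration. It assumes that density means the fraction of token positions with a nonzero (here, positive) activation. That definition and the `activation_density` helper are assumptions for illustration. Because the reported percentage is rounded, the resulting count is approximate.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token floats.
    The "greater than threshold" definition is an assumption.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Figures taken from the Configuration and density lines of this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
REPORTED_DENSITY = 0.727 / 100  # 0.727%

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * REPORTED_DENSITY)

# Tiny made-up example: one active position out of four.
toy_density = activation_density([[0.0, 1.5], [0.0, 0.0]])
```

With these inputs, `total_tokens` is 3,145,728, so a density of 0.727% corresponds to roughly 22,900 token positions on which the neuron fires.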