Explanations

making promises

The neuron strongly activates on words expressing a pledge or promise (e.g. “promise,” “promises,” “pledged,” “promised”).

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token             Logit
"hamilan"         -1.76
" TAMBÉM"         -1.57
" pól"            -1.49
"pú"              -1.45
"conocimiento"    -1.45
"adillas"         -1.45
" drodze"         -1.41
"昨日"            -1.41
"cuerdo"          -1.41
" glä"            -1.40

Positive Logits

Token             Logit
"or"              1.85
"as"              1.84
"et"              1.73
"he"              1.72
"is"              1.60
" However"        1.57
"al"              1.55
"ine"             1.53
"ot"              1.52
"not"             1.51

Activations Density 0.008%
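
To give a sense of scale for the density figure, the sketch below works out roughly how many token positions it corresponds to under the dashboard configuration listed above (24,576 prompts of 128 tokens each). It assumes that "activations density" means the fraction of token positions where the neuron's activation is above zero. That definition is an assumption, not something stated on this page, and the function name `activation_density` and its `threshold` parameter are hypothetical. Because the displayed 0.008% is rounded, the implied count is only approximate.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token activations.
    Assumption: a token counts as "active" when its activation is > threshold.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Dashboard configuration taken from this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of active token positions implied by a density of 0.008%.
DISPLAYED_DENSITY_PERCENT = 0.008
implied_active = round(total_tokens * DISPLAYED_DENSITY_PERCENT / 100)

# Toy example (made-up activations): 1 active token out of 8 positions.
toy = [[0.0, 0.0, 1.5, 0.0], [0.0, 0.0, 0.0, 0.0]]
toy_density = activation_density(toy)

print(total_tokens, implied_active, toy_density)
```

Under these assumptions, the neuron fires on only a few hundred of the roughly three million token positions in the dashboard sample, which fits a neuron tied to a narrow set of words such as "promise" and "pledged".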