
Explanations

screw up, mess up

The neuron activates on informal/slang verbs and verb phrases that denote making a mistake or causing failure (e.g. “screw up,” “mess up,” “ruin”).


Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted



Negative Logits

Token                  Logit
"鼇"                   -1.70
" fald"                -1.60
"emplares"             -1.56
" veneta"              -1.54
" umid"                -1.44
"谂"                   -1.41
"罫"                   -1.41
"璈"                   -1.41
(token not rendered)   -1.41
" lyder"               -1.40

Positive Logits

1.92

↵

1.59

 That

1.48

 However

1.45

1.43

1.41

 noting

1.40

 Also

1.38

que

1.37

Activations Density 0.040%
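
To put the density figure in context, the short sketch below combines it with the prompt configuration listed above (24,576 prompts of 128 tokens each). It assumes that "Activations Density" means the share of tokens on which the neuron has a nonzero activation, and that it was measured over that same prompt set; the dashboard states neither, so the result is only a rough estimate. The function names are illustrative and not part of any dashboard API.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.040


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Rough count of tokens with a nonzero activation.

    Assumption: density is the percentage of tokens on which the
    neuron fires, measured over the same set of prompts.
    """
    return round(total * density_percent / 100)


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
    print(f"{total:,} tokens in total, roughly {active:,} with a nonzero activation")
```

Under these assumptions the neuron fires on roughly 1,258 of 3,145,728 tokens. Because the displayed density is rounded to three decimal places, the true count could differ by a few dozen tokens in either direction.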