Explanations

right and wrong

This neuron detects the phrase pattern “in the right…” or “in the wrong…,” i.e. short location or direction expressions like “step in the right direction,” “posting in the wrong section,” and “point me in the right direction.”
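The pattern described above can be approximated at the surface level with a regular expression. This is only a hypothetical proxy for illustration: the neuron is not a regex, the name `matches_pattern` is invented here, and the real unit may fire on contexts this misses or stay silent on some that it matches.

```python
import re

# Hypothetical surface-level proxy for the described phrase pattern.
# Assumption: "in the right ..." / "in the wrong ..." is matched case-insensitively.
IN_THE_RIGHT_OR_WRONG = re.compile(r"\bin the (right|wrong)\b", re.IGNORECASE)


def matches_pattern(text: str) -> bool:
    """Return True if the text contains 'in the right' or 'in the wrong'."""
    return IN_THE_RIGHT_OR_WRONG.search(text) is not None


# The three example phrases quoted in the explanation.
examples = [
    "step in the right direction",
    "posting in the wrong section",
    "point me in the right direction",
]

for phrase in examples:
    print(phrase, matches_pattern(phrase))
```

A check like this is useful only as a quick filter when browsing text for candidate examples; it says nothing about activation strength.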

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
" extremely"    -0.81
" solche"       -0.80
" flexível"     -0.79
" such"         -0.79
" главных"      -0.79
"its"           -0.78
" Gedicht"      -0.77
" this"         -0.77
" isolada"      -0.77
"婁"            -0.76

Positive Logits

Token           Logit
"the"           3.16
" đúng"         2.03
" rätt"         1.98
" juiste"       1.68
" right"        1.68
" wrong"        1.63
" прави"        1.63
" सही"          1.52
" correct"      1.50
" richtigen"    1.48

Activations Density: 0.061%
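
For a rough sense of scale, the density can be turned into an approximate token count. The sketch below assumes the density is measured over the dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each); the page does not state the denominator, so treat the result as an estimate under that assumption.

```python
# Figures copied from the dashboard configuration and the density readout.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.061

# Assumption: density = activating tokens / all tokens in the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(total_tokens)          # 3145728
print(approx_active_tokens)  # 1919
```

Because the density is reported to only three decimal places, the count is best read as "roughly 1,900 tokens" and not as an exact figure.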