Explanations

daring actions

The neuron fires on multiword threat or challenge constructions expressing opposition (e.g. “to be reckoned with,” “those who oppose them,” “if you have a problem with that”).

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Мето

-0.90

充電

-0.79

ーん

-0.78

IORITY

-0.77

 menjelaskan

-0.77

vaig

-0.76

atorze

-0.76

 horrid

-0.75

automat

-0.75

mutlich

-0.75

Positive Logits

 dare

2.36

 dared

2.22

 dares

1.95

 attempt

1.63

 Dare

1.60

dare

1.60

Dare

1.55

 challenge

1.52

try

1.52

敢

1.51

Activations Density 0.027%
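
As a rough sanity check, the density figure can be turned into an approximate token count using the prompt configuration listed above. The sketch below assumes that activation density means the fraction of all dashboard tokens on which the neuron has a non-zero activation; the page does not state this definition, and the variable names are illustrative only.

```python
# Figures taken from the Configuration section of this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128

# Reported activation density, converted from a percentage to a fraction.
ACTIVATION_DENSITY = 0.027 / 100

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with non-zero activation) / (all dashboard tokens).
estimated_active_tokens = ACTIVATION_DENSITY * total_tokens

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:.0f}")
```

Under that assumption the neuron fires on roughly 850 of about 3.1 million tokens. Because the reported percentage is rounded, the count is only an estimate.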