Explanations

disabling specific things

The neuron fires on occurrences of the token “disable”.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

(not captured)    -3.36
(not captured)    -3.33
↵↵    -3.22
(not captured)    -3.20
(not captured)    -2.75
(not captured)    -2.56
 حتى    -2.48
–    -2.45

Positive Logits

澌    3.41
谵    2.63
涑    2.61
了一个    2.56
 сист    2.55
潆    2.53
ጺ    2.52
‪.‬‬    2.48
瘓    2.48
Ꮴ    2.47

Activations Density 0.008%
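
To put the density figure in context, the sketch below combines it with the prompt configuration listed above (24,576 prompts of 128 tokens each). It assumes that activation density means the percentage of tokens in that prompt set on which the neuron has a nonzero activation. The page does not state this definition, so treat it as an assumption. The function names are illustrative, not part of any dashboard API.

```python
# Figures copied from the dashboard's Configuration and density fields.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.008  # reported value, presumably rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of token positions where the neuron is active.

    Assumption: density = 100 * active_tokens / total_tokens.
    """
    return total * density_percent / 100.0


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, DENSITY_PERCENT)
    print(f"total tokens: {total:,}")
    print(f"estimated active tokens: {active:.0f}")
```

Under that assumption, the neuron is active on roughly 250 of about 3.1 million token positions. Because the reported density is rounded, the true count could differ by a few dozen.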