Explanations

expressions of apology

The neuron detects apology phrases (instances of “sorry” and its immediate follow-on apology wording).

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token        Logit
 each        -1.38
Vendo        -1.32
 both        -1.27
bảng         -1.27
how          -1.27
Mvh          -1.26
yyv          -1.23
管理器       -1.21
 helps       -1.20
 suggests    -1.18

Positive Logits

Token        Logit
\"           1.24
É            1.24
 مترجم       1.22
 自己        1.20
if           1.20
↵            1.18
(token not shown)    1.16
(token not shown)    1.13
im           1.11

Activations Density 0.013%
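
To give the density figure a sense of scale, the arithmetic below multiplies the dashboard's prompt count by its tokens-per-prompt value and applies the 0.013% density. It is a rough sketch, not the dashboard's own computation: it assumes that density means the share of dashboard tokens with nonzero activation, and the variable names are illustrative. Because the displayed percentage is rounded, the result is only approximate.

```python
# Figures copied from the Configuration and Activations Density entries.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.013

# Total number of token positions covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (all dashboard tokens).
estimated_active_tokens = round(total_tokens * ACTIVATION_DENSITY_PERCENT / 100)

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,}")
```

Under that assumption, the neuron fires on roughly four hundred tokens out of about 3.1 million, which fits a feature tied to a narrow phrase type such as apologies.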