Explanations

protect the innocent/guilty

The neuron detects statements that personal names or identifying details have been redacted or changed to protect someone’s privacy.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token         Logit
" pirata"     -1.68
" parati"     -1.39
" itin"       -1.36
" makam"      -1.34
" bilang"     -1.32
"opies"       -1.32
" tortura"    -1.30
" meras"      -1.30
" tuer"       -1.28
" turno"      -1.27

Positive Logits

Token         Logit
"be"          1.71
"to"          1.34
"if"          1.30
" also"       1.30
(blank)       1.27
"霄"          1.20
" there"      1.20
"憐"          1.20
"匿"          1.20
" from"       1.18

Activations Density 0.028%
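
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It rests on two assumptions that this page does not confirm: that the density is the fraction of sampled tokens on which the neuron has a nonzero activation, and that it was measured over the prompt set listed under Configuration. The function name and constants are illustrative only.

```python
# Figures quoted on this page (Configuration and Activations Density).
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.028


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens on which the neuron fires).

    Assumption: density is the share of all sampled tokens with a
    nonzero activation, expressed as a percentage.
    """
    total = num_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"Total tokens sampled: {total_tokens:,}")
print(f"Estimated active tokens: {active_tokens:,}")
```

Under those assumptions the neuron fires on roughly 880 of about 3.1 million tokens, which fits a feature tied to a narrow phrase pattern such as name-redaction notices.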