Explanations

uncovering secrets and exposing wrongdoing

The neuron fires strongly on words that name or signal an investigative exposé, such as “revelations,” “investigation,” “exposes,” and “uncovered,” along with similar terms that announce the disclosure of hidden or secret information.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

for

-1.50

-1.34

-1.30

styleUrls

-1.26

<i>

-1.24

-1.23

if

-1.23

 mensagem

-1.22

-1.20

Positive Logits

וּ

1.61

』

1.55

vierten

1.48

觊

1.41

 that

1.37

fié

1.34

 khiến

1.34

Instead

1.33

Surprisingly

1.32

but

1.31

Activations Density 0.068%
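
To give the density figure a sense of scale, here is a rough sketch that converts it into an approximate token count. It uses the dashboard configuration above (24,576 prompts of 128 tokens each). It assumes that "Activations Density" means the fraction of dashboard tokens on which the neuron has a nonzero activation, which the page does not state. The function name `estimate_active_tokens` is hypothetical and chosen only for illustration.

```python
# Values taken from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.068  # reported "Activations Density"


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens with nonzero activation).

    Assumption: density is the share of all dashboard tokens on which
    the neuron activates at all. The page does not define the metric.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:,.0f}")
```

If that reading of the metric is right, the neuron is active on roughly two thousand of the roughly three million tokens in the dashboard sample. This is consistent with a sparse feature that responds to a narrow topic.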