INDEX

Explanations

focus and interest levels

The neuron fires strongly on first‐person self‐references and accompanying emotional or reflective verbs (e.g. “I,” “myself,” “feeling,” “surprised”).

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token               Logit
" hesit"            -0.84
"lland"             -0.82
"Elli"              -0.82
" kış"              -0.82
"Redund"            -0.81
"ByID"              -0.80
" confined"         -0.79
"hicule"            -0.79
" superstitious"    -0.78
"蛍"                -0.77

Positive Logits

Token               Logit
" zoned"            1.55
" tuned"            1.50
" zone"             1.49
" zoning"           1.48
" tuning"           1.47
" checked"          1.37
" tune"             1.36
"tune"              1.34
"Tune"              1.31
" glaze"            1.30

Activations Density 0.030%
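
To give the density figure some scale, the sketch below combines it with the prompt configuration listed above. It assumes that "Activations Density" means the share of dashboard tokens on which the feature has a nonzero activation; the page itself does not define the term, so treat the result as a rough estimate. The function and constant names are illustrative only.

```python
# Figures copied from the Configuration and Activations Density lines.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.030  # ASSUMPTION: percent of tokens with nonzero activation


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of tokens on which the feature fires."""
    return total * density_percent / 100.0


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, DENSITY_PERCENT)
    print(f"total tokens: {total:,}")
    print(f"estimated active tokens: {active:,.0f}")
```

Under that assumption the dashboard covers 3,145,728 tokens, of which roughly 940 would activate the feature. Because the density is shown rounded to three decimal places, the true count could differ by a few percent.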