Explanations

Pronouns referring to self or others

The neuron fires on personal pronouns that refer to discourse participants, and on related words (e.g. “I,” “me,” “you,” “us,” “him,” “here,” “one”), marking direct references to the speaker, the audience, or others.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

0.59
0.41
0.39
0.38
0.37
0.36
↵  0.36
0.35

Positive Logits

StoredKeys  0.43
 persönlich  0.40
ſelf  0.39
្នុង  0.39
eredith  0.39
zelf  0.39
ಲೆಯ  0.39
 Bacteriol  0.38
squarePos  0.38
EnglishMarks  0.38

Activations Density 0.873%
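
To give a sense of scale for the density figure, the sketch below assumes that "activations density" means the fraction of token positions on which the neuron's activation is above zero, and that it was measured over the full prompt set listed under Configuration. Both are assumptions, not something the dashboard states. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    `activations` is a list of prompts, each a list of per-token activation values.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Toy data (made up): two prompts of four tokens, one active token in total.
toy = [[0.0, 0.0, 1.2, 0.0], [0.0, 0.0, 0.0, 0.0]]
density = activation_density(toy)

# Scale implied by the configuration, assuming every prompt is exactly 256 tokens.
total_tokens = 392_802 * 256
approx_active = round(total_tokens * 0.873 / 100)
```

Under these assumptions, the prompt set contains about 100.6 million tokens, so a density of 0.873% corresponds to roughly 878,000 token positions with a nonzero activation.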