Explanations

ignore certain things

The neuron is a strong detector for personal/prosodic function words—especially pronouns and hedging adverbs (e.g. “you,” “they,” “we,” “maybe,” “apparently”).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token  Logit
" Sudden"  0.61
"骤"  0.61
"驟"  0.58
"缺少"  0.56
" Зу"  0.56
" sudden"  0.52
" Reproduced"  0.52
" tentative"  0.52
" затруд"  0.51
"ాల్సి"  0.50

Positive Logits

Token  Logit
" ignore"  2.83
" Ignore"  2.59
" ignoring"  2.53
" disregard"  2.52
"Ignore"  2.50
"ignore"  2.48
" ignores"  2.44
" ignored"  2.41
"無視"  2.38
" disregarding"  2.33

Activations Density 0.417%
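
To give a sense of scale for the figures above, the following minimal sketch multiplies out the prompt configuration (392,802 prompts, 256 tokens each) and applies the 0.417% activation density to it. It assumes the density was measured over that same prompt set, which the page does not state; the function and constant names are hypothetical and chosen only for illustration.

```python
# Figures copied from the dashboard configuration above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.417 / 100  # 0.417% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Rough count of token positions where the feature is active.

    ASSUMPTION: the density applies to the same token set as `total`.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under that assumption, the feature would be active on roughly four token positions per thousand, a sparse pattern that fits a feature tied to a narrow family of words such as "ignore" and "disregard".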