Explanations

undermining domestic happiness

The neuron fires on very high-frequency function words, especially the definite article “the” (often together with words like “that,” “for,” and “all”).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

보다는    0.65
දි    0.59
 പ്രശ്    0.58
each    0.57
 مختلف    0.57
 वेगवेगळ्या    0.57
／    0.57
略    0.55
 مختلفة    0.54
漏    0.54

Positive Logits

本来    1.26
原本    1.22
 cherished    1.15
 treasured    1.09
 originally    1.04
 beloved    1.03
 normally    0.98
 hitherto    0.94
 originalmente    0.94
 innocent    0.94

Activations Density 0.602%
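
As a rough sanity check on the configuration figures, the sketch below multiplies the prompt count by the prompt length and applies the activation density. It assumes, which the dashboard does not state, that the density is the fraction of tokens in this prompt set on which the neuron has a nonzero activation. The constant and function names are illustrative only.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.602 / 100  # 0.602% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens in the prompt set, given fixed-length prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Approximate count of tokens with nonzero activation.

    Assumption: density is measured over the same tokens counted in `total`.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:,}")
```

If the density was measured on a different sample than the prompts listed under Configuration, the second figure is only an order-of-magnitude guide.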