Explanations

romantic interest

The neuron fires strongly on open-class “content” words (nouns, verbs, adjectives, and adverbs that carry real semantic weight) rather than on common function words.

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted
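
As a quick sanity check on the scale implied by the Configuration figures above, the total number of tokens behind the dashboard can be worked out directly. This is a minimal sketch: the names `NUM_PROMPTS`, `TOKENS_PER_PROMPT`, and `total_tokens` are hypothetical, and it assumes every prompt is exactly 256 tokens long, as the listing states.

```python
# Figures copied from the Configuration listing; the names are illustrative only.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256  # assumes every prompt has exactly this many tokens


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Return the total token count for fixed-length prompts."""
    return num_prompts * tokens_per_prompt


TOTAL = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
print(f"{TOTAL:,} tokens")  # 100,557,312 tokens
```

So the dashboard statistics are drawn from roughly 100 million tokens of monology/pile-uncopyrighted.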

Negative Logits

做什么  0.88
০  0.80
rophages  0.78
princip  0.77
ezi  0.75
nú  0.75
woven  0.75
readthedocs  0.75
do  0.74
Pepper  0.74

Positive Logits

ли  0.76
恼  0.72
ле  0.70
 rumour  0.70
 Lithuan  0.68
 الماس  0.68
DMA  0.66
 יד  0.65
 Möglichkeit  0.65
謗  0.65

Activations Density 0.000%