Explanations

food, meals, and dishes

The neuron fires on emphatically positive or praising language: strongly evaluative adjectives and recommendation words that convey enthusiasm (e.g. "perfect," "refreshing," "must-try").

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

וּ

-1.88

 бампер

-1.29

וּ

-1.28

 hvordan

-1.23

取り付け

-1.22

ִּ

-1.22

ּוֹ

-1.22

ֶּ

-1.22

cfg

-1.20

alogical

-1.19

Positive Logits

 ویتامین

1.75

ׇ

1.69

ܐ

1.65

 cukru

1.59

ֺ

1.54

 витами

1.43

dä

1.38

 chauffage

1.34

 horloge

1.30

 réfrigérateur

1.27

Activations Density: 0.052%
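
To put the density figure in context, the short sketch below combines it with the prompt configuration listed above. It assumes that the activation density is the fraction of tokens in the 24,576 dashboard prompts on which the neuron fires. The dashboard does not state this, so the resulting count is only a rough estimate. The function and constant names are illustrative and not part of any dashboard API.

```python
# Values copied from the dashboard. How "density" is defined is an ASSUMPTION:
# here it is taken to be the share of prompt tokens with a nonzero activation.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.052


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated number of tokens on which the neuron fires (hypothetical helper)."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,.0f}")
```

Under that assumption, the neuron is active on roughly 1,600 of about 3.1 million tokens, which is consistent with a sparse, narrowly triggered feature.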