Explanations

A followed by adjectives

The neuron fires on substantive, content-heavy nouns (particularly abstract or key-concept terms) rather than on function words or simple modifiers.

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
" youngest"     0.86
" prettiest"    0.84
" coolest"      0.82
" brightest"    0.81
" meisten"      0.80
" entire"       0.77
" fittest"      0.76
" Almighty"     0.75
" hardest"      0.75
" funniest"     0.75

Positive Logits

Token           Logit
"or"            0.75
"es"            0.74
"을"            0.72
(missing)       0.69
"および"        0.67
"를"            0.66
"。"            0.66
"에서의"        0.65
"리의"          0.64

Activations Density 0.157%
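
To give the density figure a sense of scale, the sketch below combines it with the prompt configuration listed above. It rests on two assumptions that the page does not confirm: that the density was measured over the same 392,802 prompts, and that every prompt contributes exactly 256 tokens. The function names are illustrative only, and because the density is shown rounded to three decimals, the result is an estimate rather than an exact count.

```python
# Figures copied from the dashboard; everything else is an illustrative assumption.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.157 / 100  # 0.157% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions, assuming every prompt has the same length."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(n_tokens: int, density: float) -> int:
    """Approximate number of positions where the feature is active.

    Assumes the density is the fraction of token positions with a
    nonzero activation, measured over these same prompts.
    """
    return round(n_tokens * density)


if __name__ == "__main__":
    n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(n, ACTIVATION_DENSITY)
    print(f"total tokens: {n:,}")
    print(f"estimated active tokens: {active:,}")
```

Under these assumptions the feature is active on roughly 158 thousand of about 100.6 million token positions, which is consistent with a sparse feature.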