Explanations

might / will / have

The neuron detects hedging or generalizing qualifiers—words like “some,” “many,” or “might argue” that introduce opinions or broad generalizations.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"From"       -1.84
             -1.77
"Our"        -1.73
"Your"       -1.71
"他にも"     -1.66
" cómod"     -1.64
" Deletes"   -1.62
"What"       -1.59
"any"        -1.58
"さんも"     -1.57

Positive Logits

"༘"            1.85
" grandeza"    1.73
"丷"           1.71
"⢅"            1.68
" înal"        1.66
"стойчи"       1.66
"🡺"           1.63
"珦"           1.63
"⩤"            1.63
" themselves"  1.61

Activations Density 0.017%
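
As a rough sanity check on the density figure, the sketch below converts it into an approximate count of token positions. It assumes that the density is the fraction of token positions where the neuron's activation is nonzero, and that it was measured over the dashboard prompts listed under Configuration; the page states neither. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard; everything else here is an assumption.
NUM_PROMPTS = 24_576        # "24,576 prompts"
TOKENS_PER_PROMPT = 128     # "128 tokens each"
DENSITY_PERCENT = 0.017     # "Activations Density 0.017%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Token positions with nonzero activation, ASSUMING the density is
    a simple fraction of positions over this same prompt set."""
    return total * density_percent / 100


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"token positions: {total:,}")
print(f"estimated active positions: {active:.0f}")
```

Under those assumptions the neuron fires on roughly 535 of about 3.1 million token positions, which would fit a sparse feature tied to a narrow class of qualifier words.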