Explanations

Symptoms

The neuron fires strongly on the first word of a sentence.

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

[token missing in source]    0.55
)['    0.48
etc    0.48
'&    0.46
 concerto    0.45
 nomad    0.44
 dennoch    0.44
κ    0.44
 likewise    0.43
):    0.43

Positive Logits

 मींस    0.59
 means    0.57
というか    0.55
ലാണ്    0.54
是指    0.54
矢    0.54
不仅    0.53
不僅    0.53
 znamená    0.52
指的是    0.52

Activations Density 0.032%
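
The prompt count, prompt length, and activation density reported on this page can be combined into a rough count of activating tokens. The sketch below is only illustrative. It assumes that the density is the percentage of all tokens in the dashboard prompt set on which the neuron has a nonzero activation, which the page does not state explicitly. The function and constant names are hypothetical, and because the density is shown rounded to 0.032%, the result is approximate.

```python
# Figures copied from the dashboard text above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY_PERCENT = 0.032  # rounded value as displayed


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(token_count: int, density_percent: float) -> int:
    """Approximate number of tokens on which the neuron activates.

    Assumption (not confirmed by the page): density is the share of
    tokens with a nonzero activation, expressed as a percentage.
    """
    return round(token_count * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"Total tokens:            {total:,}")
print(f"Estimated active tokens: {active:,}")
```

Under that assumption, the neuron activates on roughly 32,000 of about 100 million tokens, so it is a sparse signal.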