Explanations

offer instead

This neuron detects discourse markers that introduce explanations or assertions of knowledge, such as “what you know is,” “what this tells us is,” or “what I do know.”

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

" entails"    0.77
" είναι"    0.74
" явля"    0.71
" هست"    0.70
" ligger"    0.70
" była"    0.69
" remains"    0.66
" была"    0.66
" является"    0.62
" blir"    0.62

Positive Logits

"Mostly"    0.70
"されない"    0.65
"したのは"    0.65
" Mostly"    0.63
"少なくとも"    0.62
" केलेल्या"    0.61
"され"    0.60
"Basically"    0.59
" repente"    0.59
" गरिएको"    0.59

Activations Density 0.066%
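
To put the density figure in context, the following is a rough sketch of the arithmetic, not a description of how the dashboard computes it. It assumes that the density is measured per token position over the same prompt set listed under Configuration, and that 0.066% is a rounded display value. The function and constant names are hypothetical, chosen for illustration only.

```python
# Figures copied from the Configuration and Activations Density entries above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY_PERCENT = 0.066  # displayed value, presumably rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate number of token positions where the neuron fires.

    Assumption (not stated in the source): density is the fraction of
    token positions with a nonzero activation.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under these assumptions the prompt set covers about 100.6 million token positions, of which roughly 66 thousand would activate the neuron. Because the displayed density is rounded, the second figure is only an order-of-magnitude estimate.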