Explanations

trivial statements or affirmations

The neuron fires on downplaying qualifiers, especially “just” (and similar minimizers like “only” or “another”), used to diminish or dismiss the noun phrase that follows.
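As a rough illustration of the pattern that explanation describes, the sketch below flags the three minimizer words it names. This is only a lexical proxy and not the neuron itself: the word list, the `find_minimizers` helper, and the sample sentence are all hypothetical, and the real neuron presumably depends on context that a regex cannot see.

```python
import re

# Hypothetical word list copied from the explanation text; this is an
# assumption for illustration, not the neuron's actual trigger set.
MINIMIZERS = ("just", "only", "another")
PATTERN = re.compile(r"\b(" + "|".join(MINIMIZERS) + r")\b", re.IGNORECASE)


def find_minimizers(text):
    """Return (word, character offset) for each minimizer word in text."""
    return [(m.group(1).lower(), m.start()) for m in PATTERN.finditer(text)]


# Sample sentence invented for this sketch.
print(find_minimizers("It's just a scratch, only a rumor."))
```

A check like this can help pick candidate sentences for inspecting activations, but it will also match neutral uses such as "just now" or "only child".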

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" gracilis"    -0.76
"头像"          -0.74
"UIManager"    -0.73
" Bres"        -0.73
"fillRect"     -0.72
"ちゃんと"       -0.72
"孢"            -0.71
" guineas"     -0.71
"確"            -0.70
"ricts"        -0.69

Positive Logits

Token          Logit
"又不是"        1.00
" merely"      0.94
"ظهار"         0.94
"品です"        0.91
"Although"     0.90
"tował"        0.85
"Betyg"        0.85
"不过是"        0.84
"Neden"        0.83
"trivial"      0.83

Activations Density 0.039%
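
To put the density figure in perspective, the sketch below estimates how many tokens it corresponds to. It assumes, without confirmation from the page, that the density is the fraction of tokens with nonzero activation and that it was measured over the 24,576 dashboard prompts of 128 tokens each; the variable names are illustrative.

```python
# Values copied from the dashboard; treating the density as a fraction of
# the dashboard tokens is an assumption, not something the page states.
N_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.039

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * DENSITY_PERCENT / 100

print(total_tokens)            # 3145728
print(round(expected_active))  # 1227
```

Under those assumptions, the neuron is active on roughly 1,200 of about 3.1 million tokens, which is consistent with a narrow, specific feature.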