Explanations

descriptions of subtle states

The neuron detects intense, sentiment-laden descriptive words, especially strong adjectives and adverbs (e.g. “downright,” “deadly,” “predictably,” “numbingly”) used to convey critique or emphasis.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

actéristiques    -1.49
 différents    -1.45
thodoxy    -1.30
がとても    -1.29
暗暗    -1.28
caping    -1.27
瞟    -1.27
 ۳۰    -1.26
などは    -1.24
reonine    -1.24

Positive Logits

 seguras    1.39
 mismas    1.30
 utterly    1.20
 both    1.20
 variadas    1.13
 profoundly    1.12
 only    1.11
 nadru    1.09
 temperaturas    1.08
 Masalah    1.07

Activations Density 0.029%
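
To give the density figure a sense of scale, the short Python sketch below converts it into an approximate token count. It assumes that the density is the share of tokens in the dashboard sample (24,576 prompts of 128 tokens each) on which the neuron activates at all; the page does not state this definition, so treat the result as a rough estimate. The function and variable names are illustrative only.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.029


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens with a nonzero activation).

    Assumption: density is measured over every token in the sample.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


total, active = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY_PERCENT
)
print(f"total tokens in sample: {total:,}")
print(f"estimated active tokens: {active:.0f}")
```

Under that assumption the sample holds 3,145,728 tokens, and the neuron would be active on roughly 900 of them.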