Explanations

great positive evaluations

The neuron detects enthusiastic praise and positive evaluative language (e.g. compliments and glowing adjectives).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

PILE          0.40
dbo           0.40
บาง           0.39
PCI           0.38
特定の        0.38
://           0.37
Università    0.37

Positive Logits

 👍           0.59
 mooie        0.58
👍            0.58
 güzel        0.55
 mooi         0.55
 schön        0.53
 идея         0.53
 teamwork     0.52
 camaraderie  0.51
 artwork      0.51

Activations Density 0.218%
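
To put the density figure in context, here is a minimal sketch of how such a number could be computed and what it would imply for the prompt set listed under Configuration. It assumes that "activations density" means the fraction of token positions where the neuron's activation is above zero, and that the figure was measured over all 392,802 prompts of 256 tokens each; the page confirms neither. The function name `activation_density` and the `threshold` parameter are hypothetical and not part of any dashboard API.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token
    activation values for a single neuron. (Assumed definition; the
    dashboard does not state how it computes density.)
    """
    flat = [a for prompt in activations for a in prompt]
    if not flat:
        return 0.0
    active = sum(1 for a in flat if a > threshold)
    return active / len(flat)


# Figures quoted on this page.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
REPORTED_DENSITY = 0.218 / 100  # 0.218%

# Assumption: the density was measured over the full prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * REPORTED_DENSITY)

# Toy example: 2 active positions out of 8.
toy = [[0.0, 0.0, 1.3, 0.0], [0.0, 0.7, 0.0, 0.0]]
toy_density = activation_density(toy)

print(f"total tokens:         {total_tokens:,}")
print(f"approx active tokens: {approx_active_tokens:,}")
print(f"toy density:          {toy_density:.1%}")
```

Under these assumptions, a density of 0.218% corresponds to roughly one active token in every 459, which fits a neuron that responds only to a narrow class of praise-like text.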