Explanations

positive qualities or assessments

This neuron detects positive evaluative adjectives (e.g., good, great, best, excellent) used to praise or promote something.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token             Logit
" koste"          -0.93
" cewek"          -0.91
" vekt"           -0.90
" ekst"           -0.86
"shutterstock"    -0.85
" koke"           -0.84
" kalori"         -0.84
" leveren"        -0.84
"Shutterstock"    -0.83
"፨"               -0.81

Positive Logits

Token                Logit
" performance"       1.06
"PhysRev"            1.05
" conditions"        1.05
" choice"            1.04
" results"           1.02
" predisposition"    0.98
" relationship"      0.97
" combination"       0.95
" deal"              0.94
" mood"              0.92

Activation Density: 0.014%
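
As a rough sanity check on the density figure, the sketch below estimates how many tokens in the dashboard prompts this neuron fires on. It assumes the density is the fraction of all prompt tokens with nonzero activation, measured over the 24,576 prompts of 128 tokens each listed under Configuration. That interpretation, and the helper name `estimated_active_tokens`, are assumptions for illustration and are not stated by the dashboard.

```python
def estimated_active_tokens(
    n_prompts: int, tokens_per_prompt: int, density_percent: float
) -> float:
    """Hypothetical helper: estimate how many tokens have nonzero activation.

    Assumes density_percent is the share of all prompt tokens on which
    the neuron activates, expressed as a percentage.
    """
    total_tokens = n_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


# Figures taken from the Configuration and density lines of this page.
total = 24_576 * 128
estimate = estimated_active_tokens(24_576, 128, 0.014)

print(total)            # 3145728 tokens in the dashboard prompt set
print(round(estimate))  # 440 tokens, under the assumption stated above
```

Under that reading, the neuron would be active on roughly 440 of about 3.1 million tokens, which is in line with a sparse detector for a narrow class of evaluative adjectives.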