Explanations

friendly followed by a noun

The neuron activates on hyphenated “-friendly” descriptors (e.g. family-friendly, kid-friendly, newbie-friendly).
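To make the described pattern concrete, here is a minimal sketch that picks hyphenated "-friendly" descriptors out of text. The regular expression and the function name `find_friendly_descriptors` are illustrative assumptions; they approximate the wording of the explanation and are not how the neuron itself computes its activation.

```python
import re

# Hypothetical pattern: a word, a hyphen, then "friendly" (case-insensitive).
# This mirrors the explanation's wording only; it is not derived from the model.
FRIENDLY_DESCRIPTOR = re.compile(r"\b\w+-friendly\b", re.IGNORECASE)


def find_friendly_descriptors(text: str) -> list[str]:
    """Return every hyphenated '-friendly' descriptor found in text."""
    return FRIENDLY_DESCRIPTOR.findall(text)


sample = "A family-friendly hotel with kid-friendly menus and friendly staff."
print(find_friendly_descriptors(sample))  # ['family-friendly', 'kid-friendly']
```

The bare word "friendly" in the sample is not matched, which is consistent with the explanation's emphasis on the hyphenated form.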

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token                 Logit
"you"                 -1.80
" there"              -1.54
"нии"                 -1.52
" pacar"              -1.47
" some"               -1.47
" kedu"               -1.45
" gelar"              -1.45
(token not shown)     -1.44
" continúas"          -1.38
" televisiva"         -1.36

Positive Logits

Token                 Logit
"又不"                1.85
(token not shown)     1.75
"楹"                  1.65
"却不"                1.64
"₶"                   1.62
" suoi"               1.61
"Its"                 1.59
" saine"              1.49
"瓮"                  1.47
" ۱۴"                 1.46

Activations Density 0.007%
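
As a rough sense of scale, the sketch below combines the dashboard figures above. It assumes, and this is only an assumption, that the density is the fraction of dashboard tokens on which the neuron has a nonzero activation and that it is measured over the same 24,576 prompts of 128 tokens each. The variable and function names are hypothetical.

```python
# Figures taken from the dashboard configuration and density line.
PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.007


def total_tokens(prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Tokens with nonzero activation, ASSUMING density is a fraction of all tokens."""
    return total * density_percent / 100


total = total_tokens(PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(total)          # 3145728
print(round(active))  # 220
```

Under that assumption, the neuron would fire on roughly 220 of about 3.1 million tokens, which fits a narrowly targeted pattern such as "-friendly" descriptors.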