Explanations

ugly descriptions

The neuron activates on the adjective “ugly” wherever it appears in text, including direct mentions of the word itself.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token | Logit
"訁" | -2.80
"剮" | -2.69
"But" | -2.58
"菝" | -2.31
(token not shown) | -2.22
"にほんブログ村" | -2.19
" แต่" | -2.19
(token not shown) | -2.16
")." | -2.14
" noti" | -2.11

Positive Logits

Token | Logit
"或" | 2.52
(token not shown) | 2.39
"or" | 2.28
(token not shown) | 2.25
" fashioned" | 2.13
"🪬" | 2.13
"utterstock" | 2.08
"也非常" | 2.06
"劼" | 2.05
"榑" | 2.02

Activations Density 0.002%
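
To give a sense of scale for the density figure, the sketch below estimates how many token positions the neuron fires on. It assumes the density is measured over the dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each), which the page does not state explicitly, and that the displayed 0.002% is a rounded value. The function and constant names are illustrative only.

```python
# Rough estimate of how many tokens activate the neuron.
# Assumption: the 0.002% density is computed over the dashboard prompts
# (24,576 prompts x 128 tokens); the page does not confirm this.

NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.002  # as displayed; presumably rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Token positions with nonzero activation, given a density in percent."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(total)          # 3145728
print(round(active))  # 63
```

Under these assumptions the neuron fires on roughly 63 of about 3.1 million token positions, which is consistent with a feature tied to a single word such as “ugly”.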