INDEX

Explanations

suffix -ny, -ism, -ery, -valry OR prefix "ant-", "miso-", "pedo-"

The neuron consistently lights up on words that denote ideological biases or hatreds (–isms and –phobias), such as misogyny/misogynistic, transphobic, nihilism, misanthropy, etc.
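
The two explanations above can be compared directly. The sketch below encodes the affix-based explanation as a regular expression and checks it against the example words named in the second explanation. The pattern and the names `AFFIX_PATTERN` and `matches_affix_rule` are assumptions made for illustration; they are not part of the dashboard or the auto-interp output.

```python
import re

# Hypothetical encoding of the affix-based explanation:
# prefix "ant-", "miso-", "pedo-" OR suffix -ny, -ism, -ery, -valry.
AFFIX_PATTERN = re.compile(
    r"(?:ant|miso|pedo)\w*|\w*(?:ny|ism|ery|valry)",
    re.IGNORECASE,
)


def matches_affix_rule(word: str) -> bool:
    """Return True if the whole word fits the prefix or suffix rule."""
    return AFFIX_PATTERN.fullmatch(word) is not None


# Example words taken from the second explanation.
examples = ["misogyny", "misogynistic", "transphobic", "nihilism", "misanthropy"]
results = {word: matches_affix_rule(word) for word in examples}

for word, hit in results.items():
    print(f"{word}: {hit}")
```

Under this encoding, "transphobic" and "misanthropy" do not match the affix rule, which suggests the two explanations overlap without being equivalent.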

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token    Logit
"읭"    -1.97
" importanza"    -1.96
" envisioned"    -1.88
" нужен"    -1.88
(blank)    -1.83
" revelado"    -1.81
"PUESTA"    -1.80
"睥"    -1.80
"怍"    -1.78
"沺"    -1.77

Positive Logits

Token    Logit
(blank)    2.73
"↵"    2.50
(blank)    2.02
(blank)    1.83
(blank)    1.78
(blank)    1.69
"↵↵"    1.67
"―"    1.60
"</b>"    1.58
" هم"    1.58

Activations Density 0.003%
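
To put the density figure in context, the arithmetic below is a rough sketch that assumes the density is the fraction of tokens with a nonzero activation, measured over the same prompt set listed under Configuration. Both points are assumptions, and the dashboard value is rounded, so the result is only an order-of-magnitude estimate.

```python
# Figures from the Configuration section of the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128

# Rounded value shown on the dashboard, in percent.
DENSITY_PERCENT = 0.003

# Total tokens in the prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under these assumptions the neuron fires on roughly 94 of about 3.1 million tokens.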