Explanations

abstract concepts and virtues

The neuron activates strongly on single, weighty abstract or philosophical nouns such as “truth,” “beauty,” “sin,” “death,” and “illusion,” which mark major existential or moral concepts.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
"Almost"                                 -1.23
"Each"                                   -1.20
"Start"                                  -1.20
"leggen"                                 -1.13
"for"                                    -1.10
" dadas"                                 -1.09
"וּ"                                      -1.08
" inggris"                               -1.06
"About"                                  -1.05
"Get"                                    -1.04

Positive Logits

Token (quoted to show leading spaces)    Logit
" itself"                                1.57
" deceit"                                1.14
"its"                                    1.13
" prevenção"                             1.08
"ifr"                                    1.06
"があれば"                                1.05
"ufc"                                    1.05
"その"                                    1.05
"かもしれませんが"                         1.05
" propriedades"                          1.05

Activations Density 0.019%
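
To give the density figure a sense of scale, the sketch below combines it with the prompt counts listed under Configuration. It assumes that the density is the fraction of tokens in those dashboard prompts on which the neuron fires. The page does not state this, so treat the result as a rough estimate only. The function names are hypothetical and chosen for illustration.

```python
# Values copied from the dashboard text above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.019  # "Activations Density 0.019%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens with a nonzero activation.

    ASSUMPTION: density is measured over the same token pool as `total`.
    """
    return total * density_percent / 100.0


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, DENSITY_PERCENT)
    print(f"total tokens: {total:,}")
    print(f"estimated active tokens: {active:.0f}")
```

Under that assumption, the neuron would fire on roughly 600 of about 3.1 million tokens, which fits a feature tied to a narrow set of abstract nouns.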