Explanations

verbs introducing claims or beliefs

The neuron fires on words that attribute ideas or opinions—i.e. verbs like “assume,” “believe,” “argue,” “thought,” “named,” “view,” etc., which introduce or report theories, beliefs, or judgments.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
"Tabela"                                 -1.09
"{},"                                    -1.07
"[],"                                    -1.05
"ഔ"                                      -1.04
"ַּ"                                      -1.03
"laştır"                                 -1.02
" různ"                                  -1.00
"ִּ"                                      -0.98
"ᾧ"                                      -0.98
" diferite"                              -0.95

Positive Logits

Token (quoted to show leading spaces)    Logit
" that"                                  1.27
" even"                                  1.19
" only"                                  1.16
"стоит"                                  1.09
" chré"                                  1.06
"埜"                                      1.01
" Unterricht"                            1.00
"to"                                     0.99
"too"                                    0.99
"omorphisms"                             0.99

Activations Density 0.024%
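
To give a sense of scale for the density figure, the sketch below combines it with the prompt configuration listed above (24,576 prompts, 128 tokens each). It rests on an assumption not stated on the page: that the density is the fraction of tokens in this prompt set on which the neuron has a nonzero activation. The displayed percentage is rounded, so the result is only a rough estimate. The function and constant names are illustrative.

```python
# Values copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.024  # as displayed on the page, hence rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions covered by the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of activating tokens.

    Assumption: density is the share of token positions with a
    nonzero activation, expressed as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:.0f}")
```

Under that assumption, the neuron would be active on only a few hundred of the roughly three million token positions, which fits a feature tied to a narrow class of words.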