Explanations

expressing confusion or difficulty

This neuron fires on first-person statements, especially occurrences of “I” and related forms (I’m, my, feel, like, etc.) in which the speaker refers to themselves.

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" チラ"    -1.07
"kug"    -1.03
" ヌード"    -0.96
" yaka"    -0.95
"ņu"    -0.94
"铆"    -0.93
"しかし"    -0.93
"vay"    -0.93
"ersatz"    -0.93
" pérd"    -0.92

Positive Logits

" actitudes"    1.19
" scary"    1.05
"bad"    1.05
" weird"    1.05
" influenced"    1.02
" termed"    1.01
" complicated"    1.01
" aggressive"    0.98
" stress"    0.96
" curving"    0.94
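
Logit lists like the ones above are commonly obtained by projecting a neuron's output direction through the model's unembedding matrix and sorting the resulting per-token scores. The sketch below illustrates that idea in plain Python, under that assumption. The three-dimensional vectors, the four-token vocabulary, and the helper names `logit_effects` and `top_and_bottom` are hypothetical and are not taken from this dashboard.

```python
# Toy illustration only: every vector, token name, and score here is made up.

def logit_effects(direction, unembed):
    """Dot the neuron's output direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(direction, vec))
        for token, vec in unembed.items()
    }

def top_and_bottom(effects, k):
    """Return (k most positive, k most negative) tokens with their scores."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[::-1][:k]  # most negative first
    return positive, negative

# Hypothetical neuron output direction and unembedding vectors (d_model = 3).
direction = [1.0, 0.0, -1.0]
unembed = {
    "tok_a": [1.0, 0.5, 0.0],
    "tok_b": [0.5, 0.0, -0.25],
    "tok_c": [-1.0, 0.0, 0.0],
    "tok_d": [0.0, 1.0, 0.5],
}

effects = logit_effects(direction, unembed)
positive, negative = top_and_bottom(effects, k=2)
print(positive)
print(negative)
```

In a real model the same ranking would run over the full vocabulary, and the ten highest and ten lowest scores would fill the two lists.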

Activations Density 0.085%
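
For a rough sense of scale, the configuration and the density figure can be combined. The sketch below assumes that the density is the fraction of tokens in the 24,576 × 128 dashboard prompt set on which the neuron activates. The page does not state this, and the variable names are illustrative.

```python
# Assumption: density = share of dashboard tokens with a nonzero activation.
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.085

total_tokens = num_prompts * tokens_per_prompt
estimated_active = round(total_tokens * density_percent / 100)

print(total_tokens)      # 3145728
print(estimated_active)  # 2674
```

Under that assumption, the neuron would be active on roughly 2,700 of about 3.1 million tokens.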