Explanations

potential for actions

This neuron activates selectively on content-bearing action words and agent nouns, that is, terms that describe who is doing what (e.g. “guys,” “servers,” “coming,” “talk,” “seeing”).

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

"讓人"  0.75
"让人"  0.69
" নিজেই"  0.69
" mình"  0.64
"させ"  0.64
" নিজেও"  0.64
" নিজে"  0.63
"为人"  0.63
"擔任"  0.63
" خود"  0.62

Positive Logits

" perceive"  0.86
"纷纷"  0.81
" flocked"  0.81
" perceiving"  0.80
" flock"  0.79
" 인식"  0.77
" समझेंगे"  0.77
" perceptions"  0.72
" misunderstand"  0.71
" perceives"  0.71

Activations Density 0.356%
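
As a rough sanity check on the figures listed on this page, the sketch below computes the total number of tokens in the dashboard run and an approximate count of tokens on which the feature fires. It assumes that "Activations Density" means the fraction of tokens with a nonzero activation; the page itself does not define the term, so treat that reading, and all names in the code, as illustrative.

```python
# Figures copied from the dashboard text on this page.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY_PERCENT = 0.356


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens processed, given fixed-length prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate number of tokens on which the feature is active.

    ASSUMPTION: density is the fraction of tokens with nonzero activation.
    The result is only as precise as the rounded percentage shown on the page.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:,}")
```

Under that assumption, the feature is active on roughly one token in every 280, which is consistent with a sparse, selective feature.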