Explanations

said or wrote followed by a colon

This neuron detects speaker‐attribution phrases (verbs like “said” or “wrote” introducing a quote).

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

â             -0.88
breviations   -0.83
অ             -0.82
这下            -0.81
telé          -0.81
edc           -0.80
―             -0.79
ḳ             -0.79
 nicole       -0.79
laston        -0.78

Positive Logits

              1.22
 leeren       0.97
avocat        0.95
Ингредиенты   0.92
deliver       0.90
 vorhandenen  0.90
 sayang       0.89
said          0.88
intéress      0.88
 ekolog       0.87

Activations Density 0.002%
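
To put the density figure in context, the arithmetic below combines it with the prompt configuration listed above. It is only a rough sketch: it assumes that "Activations Density" means the fraction of tokens on which the neuron has a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens each. Neither assumption is stated on this page, and the variable names are illustrative.

```python
# Figures quoted on this page.
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.002  # "Activations Density 0.002%"

# Total number of tokens in the prompt set.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density is the share of tokens with a nonzero activation,
# measured over this same prompt set.
approx_active_tokens = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:.0f}")
```

Under those assumptions the neuron would fire on roughly 63 of about 3.1 million tokens, which fits a narrow detector for attribution phrases such as "said" or "wrote".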