Explanations

measure of prediction or classification

This neuron fires strongly on bibliographic or citation metadata: years, author names, publication types (e.g. “thesis,” “MBA essays”), legal-case abbreviations (“Ins.,” “Co.”), and similar reference tokens.

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

"prescription"  0.73
"住在"  0.72
"سته"  0.71
"yd"  0.70
"땠"  0.69
"栏"  0.68
"riente"  0.66
"你在"  0.66
" Prescription"  0.66
"column"  0.66

Positive Logits

"당"  0.76
"CDR"  0.76
"পাতি"  0.74
" príp"  0.73
" farther"  0.73
"Naive"  0.73
"ில்"  0.71
"CLN"  0.71
" 국가"  0.70
" Ceci"  0.70

Activations Density: 0.001%