Explanations

pronouns

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token | Logit
"<eos>" | -0.57
"featureID" | -0.53
"in" | -0.50
"dah" | -0.47
"KURZBESCHREIBUNG" | -0.46
" step" | -0.45
" piece" | -0.45
" mistake" | -0.45
"Curi" | -0.45
" question" | -0.44

Positive Logits

Token | Logit
" חיצוניים" | 0.85
"<bos>" | 0.75
"\{\\" | 0.70
" AttributeSet" | 0.67
"الحياه" | 0.64
"例文帳に追加" | 0.61
" reads" | 0.61
"ValueStyle" | 0.60
" announces" | 0.59
"rrggbb" | 0.59

Activations Density 0.045%
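
To give a sense of scale for the density figure, the sketch below converts it into an approximate token count. It assumes the density is measured over the dashboard prompts listed under Configuration (16,384 prompts of 128 tokens each); the page does not state this, so treat the result as a rough estimate. The function name `expected_active_tokens` is hypothetical and used only for illustration.

```python
def expected_active_tokens(n_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens on which the feature fires).

    Assumption: the activation density is a percentage of all tokens
    in the dashboard prompt set. The page does not confirm this.
    """
    total = n_prompts * tokens_per_prompt
    return total, total * density_percent / 100.0


# Values taken from the Configuration and Activations Density lines above.
total_tokens, active_tokens = expected_active_tokens(16_384, 128, 0.045)
print(f"total tokens: {total_tokens}")
print(f"estimated active tokens: {active_tokens:.0f}")
```

Under that assumption, the feature would fire on roughly 944 of about 2.1 million tokens.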