Explanations

gender and orientation

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

¹.	-1.16
从而	-1.05
畢業	-1.02
曲を	-0.99
ﺅ	-0.98
而成	-0.97
‌شده	-0.96
講師	-0.94
,....	-0.94
 Longitud	-0.94

Positive Logits

 EconPapers	1.28
 Figure	1.22
 trinh	1.17
fören	1.16
 Οι	1.15
"{\	1.15
 März	1.15
 Alert	1.14
≔	1.13
kraine	1.12

Activations Density 0.003%
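
To put the density figure in scale, the short sketch below estimates how many dashboard tokens it corresponds to. It rests on two assumptions that this page does not state: that density means the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration. The displayed 0.003% is presumably rounded, so the result is only a rough estimate.

```python
# Values copied from the Configuration and Activations Density lines.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.003  # as displayed; presumably rounded


def estimated_active_tokens(density_percent: float, total_tokens: int) -> float:
    """Estimate how many tokens activate the feature.

    Assumption: density is the share of tokens with a nonzero activation.
    """
    return density_percent / 100.0 * total_tokens


total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimate = estimated_active_tokens(DENSITY_PERCENT, total_tokens)

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimate:.0f}")
```

Under these assumptions the feature would fire on roughly one token in every few tens of thousands, which is consistent with a very sparse feature.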