Explanations

pronouns saying or thinking

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" lika"     -1.19
"Cinta"     -1.13
"Moda"      -1.12
"bilden"    -1.12
" klaus"    -1.07
" plafon"   -1.07
"rah"       -1.07
""          -1.06
"Aktu"      -1.06
" tarif"    -1.06

Positive Logits

" said"     1.62
" says"     1.55
"say"       1.16
" should"   1.13
" noted"    1.07
"And"       1.00
"he"        0.97
"に来て"    0.96
"Note"      0.94
" spune"    0.92

Activations Density 0.011%
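
To give the density figure some scale, the sketch below converts it into an approximate token count. It assumes, and the page does not confirm this, that the density is measured over the same dashboard sample of 24,576 prompts at 128 tokens each. The function names are hypothetical and exist only for this illustration.

```python
# Figures quoted on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.011


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the dashboard sample."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(n_tokens: int, density_percent: float) -> float:
    """Approximate count of token positions where the feature is active.

    ASSUMPTION: the density is a percentage of token positions in this
    same sample; the page does not state how it was computed.
    """
    return n_tokens * density_percent / 100.0


if __name__ == "__main__":
    n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = expected_active_tokens(n, ACTIVATION_DENSITY_PERCENT)
    print(f"total tokens: {n:,}")
    print(f"approx. active tokens: {active:.0f}")
```

Under that assumption the feature would fire on roughly 346 of about 3.1 million token positions, which is consistent with a sparse feature tied to a narrow context such as reporting verbs after pronouns.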