Explanations

despite

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Logit
"MMdd"           -0.49
" Paglinawan"    -0.49
"chiha"          -0.49
" grâce"         -0.46
"izable"         -0.46
"$'"             -0.46
"yska"           -0.46
"itization"      -0.45
":+:"            -0.43
" vorder"        -0.43

Positive Logits

Token            Logit
" that"          0.96
" which"         0.77
"UnusedPrivate"  0.68
" having"        0.62
"ardless"        0.59
" fact"          0.57
" اینکه"         0.56
" lesquels"      0.55
" bahwa"         0.55
" them"          0.55

Activations Density 0.002%
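
To give a rough sense of scale, the sketch below estimates how many tokens in the dashboard sample the feature fires on. It assumes that "Activations Density" means the fraction of sampled tokens with a nonzero activation, and that the sample is exactly 16,384 prompts of 128 tokens each. The function names are hypothetical, and the displayed 0.002% is rounded, so the result is only approximate.

```python
# Assumption: density = fraction of dashboard tokens where the feature is active.
# The constants come from the dashboard text; the helper names are illustrative.

NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # as displayed on the page (rounded)


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard sample."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> float:
    """Approximate count of tokens on which the feature activates."""
    return total_tokens(num_prompts, tokens_per_prompt) * density_percent / 100.0


if __name__ == "__main__":
    n_tokens = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    n_active = expected_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)
    print(f"tokens in sample: {n_tokens:,}")
    print(f"approx. active tokens: {n_active:.0f}")
```

Under these assumptions the feature is active on only a few dozen of roughly two million sampled tokens, which fits a narrow feature tied to a single word such as "despite".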