Explanations

trust

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" trust"       -1.30
" Trust"       -1.27
"Trust"        -1.11
"trust"        -1.09
" TRUST"       -0.99
" trusts"      -0.91
" Trusts"      -0.78
" trusting"    -0.77
"TRUST"        -0.71
"<bos>"        -0.69

Positive Logits

Token             Logit
" Reſ"            0.69
"kjø"             0.63
" Anſ"            0.62
" mourut"         0.62
" themſelves"     0.62
" Houſe"          0.61
"jee"             0.60
" necessárias"    0.59
" greateſt"       0.59
" merve"          0.59

Activations Density: 0.029%