Explanations

negative sentiment

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" walk"        -1.02
" terrible"    -0.96
" awful"       -0.90
"terrible"     -0.83
" horrible"    -0.79
" dreadful"    -0.77
" buried"      -0.77
"Terrible"     -0.71
"als"          -0.65
" horribly"    -0.64

Positive Logits

Token            Logit
" protoimpl"     0.62
"DoubleQuotes"   0.59
" Andromeda"     0.58
"incar"          0.55
"onymy"          0.55
" pleaſure"      0.55
"αρα"            0.54
" jadx"          0.54
" spind"         0.54
" overriding"    0.53

Activations Density 0.028%
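
For a rough sense of scale, the density can be combined with the dashboard prompt configuration listed above. This is only a sketch: it assumes that the density is the fraction of all dashboard tokens on which the feature has a nonzero activation, which the page itself does not state, and the variable names are illustrative.

```python
# Assumption (not stated on the page): "Activations Density" is the fraction
# of all dashboard tokens on which this feature has a nonzero activation.
num_prompts = 16_384        # from the dashboard configuration
tokens_per_prompt = 128     # from the dashboard configuration
density = 0.028 / 100       # 0.028% expressed as a fraction

total_tokens = num_prompts * tokens_per_prompt
approx_active_tokens = round(total_tokens * density)

print(f"total dashboard tokens: {total_tokens:,}")
print(f"approx. tokens where the feature fires: {approx_active_tokens:,}")
```

Under that assumption the feature would fire on only a few hundred of roughly two million tokens, which fits a narrow feature such as one tracking negative sentiment.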