INDEX

Explanations

negative self-talk

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token             Logit
" notched"        -0.95
"しくは"           -0.95
" congratulated"  -0.92
"ázka"            -0.87
" daß"            -0.86
"颊"               -0.84
"Pada"            -0.84
"Atentamente"     -0.83
"lapsible"        -0.83
"inoma"           -0.83

Positive Logits

Token             Logit
" Negative"       1.70
"Negative"        1.64
" negative"       1.45
"negative"        1.28
" NEGATIVE"       1.22
" consequences"   1.20
"(-)"             1.20
"NEGATIVE"        1.11
" publicity"      1.04
" connotations"   1.03

Activations Density: 0.014%
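
To give a rough sense of scale for the density figure, the sketch below combines it with the dashboard prompt configuration. It assumes that "Activations Density" means the fraction of tokens in the dashboard prompts on which the feature fires, which the page itself does not state; the function names are hypothetical and used only for illustration.

```python
# Rough scale estimate, under the assumption that activation density is the
# fraction of dashboard tokens with a nonzero activation for this feature.

def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate number of tokens on which the feature is active."""
    return total * density_percent / 100.0


# Values taken from the dashboard listing above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.014

total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")                  # 3,145,728
print(f"approx. active tokens: {round(active)}")   # about 440
```

Under that assumption, the feature would be active on roughly 440 of about 3.1 million tokens, which fits a feature that responds to a narrow set of tokens such as the "negative" variants listed under Positive Logits.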