Explanations

humiliation, insult, shame

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" noci"        -0.88
" scary"       -0.86
"đèn"          -0.85
" scared"      -0.85
" positivos"   -0.85
" alibi"       -0.84
" conforto"    -0.84
" sodio"       -0.84
"decrypt"      -0.83
"歟"           -0.83

Positive Logits

" humiliation"   3.47
" humiliated"    2.98
" humili"        2.95
" humiliating"   2.88
" уни"           2.05
" insult"        2.03
"Hum"            1.91
"hum"            1.91
" insults"       1.73
"Hum"            1.66

Activations Density 0.028%
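
For a rough sense of scale, the prompt count and prompt length give the total number of sampled tokens, and the activation density suggests how many of those tokens activate this feature. The sketch below is a back-of-the-envelope calculation, not part of the dashboard. It assumes that the density is the fraction of tokens with nonzero activation and that it is measured over the same 24,576 x 128 token sample, neither of which the page states. The variable names are illustrative.

```python
# Figures copied from the dashboard.
num_prompts = 24_576
tokens_per_prompt = 128
activation_density_pct = 0.028  # given as a percentage

# Total tokens in the dashboard sample.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density = fraction of sampled tokens on which the feature
# has a nonzero activation, computed over this same sample.
approx_active_tokens = total_tokens * activation_density_pct / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under those assumptions the feature would fire on only a few hundred of the roughly 3.1 million sampled tokens, which fits a narrow feature centered on "humiliation" and closely related words.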