Explanations

expressions of shame or related emotions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token      Logit
"sis"      -0.08
"sa"       -0.08
"ials"     -0.08
"ses"      -0.07
" độ"      -0.07
"ship"     -0.07
"uality"   -0.07
"urg"      -0.07
"so"       -0.07
"lake"     -0.07

Positive Logits

Token        Logit
"lessly"     0.17
"fully"      0.16
"less"       0.12
" shame"     0.12
" Shame"     0.10
"ful"        0.10
"lessness"   0.10
"full"       0.10
"fulness"    0.10
"LESS"       0.09

Activations Density 0.003%
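
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes, and the page does not confirm this, that the density is measured over every token in the dashboard prompts listed under Configuration. The variable names are illustrative only.

```python
# Assumption: activation density is computed over all tokens in the
# dashboard prompt set (24,576 prompts of 128 tokens each).
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.003  # "Activations Density 0.003%"

total_tokens = num_prompts * tokens_per_prompt
approx_active_tokens = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under that assumption the feature would fire on roughly 94 of about 3.1 million tokens, which fits a narrow feature tied to "shame" and its suffix forms.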