Explanations

words related to body image and mental health issues

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

"óm"  -0.08
"ハイ"  -0.07
"кул"  -0.07
"čka"  -0.07
"мя"  -0.07
"ãng"  -0.07
"ंधन"  -0.07
"swift"  -0.07
"ilton"  -0.07
"#ab"  -0.07
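
Non-ASCII entries in logit lists like this one are often stored in the tokenizer vocabulary in a byte-level form, for example `Ã³m` for `óm`. The sketch below shows one way to convert between the two. It assumes the model uses a GPT-2 style byte-level BPE vocabulary; the page itself does not name the tokenizer, and `decode_token` is a hypothetical helper name.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2 style byte-level BPE."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points from 256 upwards.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level vocabulary string into readable text.

    Assumption: the vocabulary follows the GPT-2 byte-level convention.
    Characters outside the table (e.g. a literal space that the dashboard
    has already rendered) are passed through as UTF-8.
    """
    raw = bytearray()
    for ch in token:
        if ch in BYTE_DECODER:
            raw.append(BYTE_DECODER[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


for raw_token in ["Ã³m", "ÐºÑĥÐ»", "Äįka", "swift"]:
    print(repr(raw_token), "->", repr(decode_token(raw_token)))
```

Plain ASCII tokens such as `swift` pass through unchanged, so the helper can be applied to a whole logit list without special-casing.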

Positive Logits

" beauty"  0.12
" Beauty"  0.11
"Beauty"  0.09
" mirror"  0.09
" confidence"  0.09
" body"  0.09
" self"  0.09
" Self"  0.09
"auty"  0.08
" vain"  0.08

Activations Density 0.117%
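
To put the density figure in context, the following back-of-the-envelope sketch combines it with the prompt configuration listed above. It assumes the density is the fraction of tokens on which the feature fires, measured over the same 24,576 prompts of 128 tokens; the page does not state this, so treat the result as a rough estimate only.

```python
# Values taken from the Configuration and Activations Density entries.
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.117

# Total tokens in the dashboard prompt set.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density is a per-token firing rate over this same prompt set.
expected_active = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {expected_active:,.0f}")
```

Under that assumption the feature would fire on a few thousand tokens out of roughly three million, which is consistent with a sparse, topic-specific feature.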