Explanations

words indicating protection, prevention, and predictability related to social issues

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

ld    -0.08
ified    -0.08
ical    -0.08
ify    -0.08
ãĤīãģĽ    -0.07
ÑĢÐ°Ð»    -0.07
/Area    -0.07
æĺĩ    -0.07
rica    -0.07
ald    -0.07

Positive Logits

 nature    0.08
 Nature    0.07
combe    0.06
tion    0.06
ISM    0.06
orno    0.06
ism    0.06
ative    0.06
tsx    0.06
pie    0.06

Activations Density 0.094%
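
As a rough sanity check on the figures on this page, the sketch below estimates how many dashboard tokens this feature fires on. It assumes that "Activations Density" means the fraction of all dashboard tokens with a nonzero activation; the page does not define the term, so treat the result as an approximation. The variable names are illustrative only.

```python
# Figures quoted on the page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.094

# Assumption (not stated on the page):
# density = tokens with nonzero activation / all dashboard tokens.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimated_active_tokens = round(total_tokens * ACTIVATION_DENSITY_PERCENT / 100)

print(f"total dashboard tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active_tokens:,}")
```

Under that assumption, the feature is active on only a few thousand of the roughly 3.1 million dashboard tokens, which is consistent with a sparse feature.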