Explanations

mentions of positive actions and behaviors, particularly in a supportive or reinforcing context

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token            Logit
"istrovstvÃŃ"    -0.07
"usted"          -0.07
"ader"           -0.06
" Carpenter"     -0.06
"ooth"           -0.06
"venge"          -0.06
"íħ"             -0.06
"quare"          -0.06
"íĥķ"            -0.06
"ä¿Ŀéļľ"         -0.06

Positive Logits

Token                Logit
" successes"         0.08
" positive"          0.07
" good"              0.07
"ä¼ĺç§Ģ"             0.07
" occasions"         0.07
" accomplishments"   0.07
" achievements"      0.06
" examples"          0.06
" Selector"          0.06
"esel"               0.06
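
Several tokens in the logit lists (for example ä¼ĺç§Ģ and ä¿Ŀéļľ) look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The page does not say which tokenizer the model uses, so the sketch below assumes the GPT-2-style byte-to-character table; decode_token is a hypothetical helper name, not part of any dashboard API. Under that assumption, the strings map back to ordinary UTF-8 text.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE (assumed here)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256, 257, ...
            printable.append(b)
            chars.append(256 + n)
            n += 1
    return dict(zip(printable, map(chr, chars)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a displayed vocabulary string back into text."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through unchanged.
            raw.extend(ch.encode("utf-8"))
    # Tokens that end mid-character cannot be decoded fully; mark them with U+FFFD.
    return raw.decode("utf-8", errors="replace")


for tok in ["ä¼ĺç§Ģ", "ä¿Ŀéļľ", "istrovstvÃŃ", "íĥķ", " Carpenter"]:
    print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, ä¼ĺç§Ģ decodes to 优秀 and istrovstvÃŃ to istrovství, while íħ is only the first two bytes of a three-byte character and so has no standalone text form.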

Activations Density 0.031%
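
For a rough sense of scale, the density can be combined with the prompt configuration listed above. This sketch assumes the density is the fraction of tokens across the dashboard prompts on which the feature activates; the page does not define it, so treat the result as an estimate only. The variable names are illustrative.

```python
# Values copied from the Configuration and Activations Density entries.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.031

# Total tokens in the dashboard sample.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = active tokens / total tokens, expressed as a percentage.
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(total_tokens)                    # 3145728
print(round(estimated_active_tokens))  # 975
```

Under that reading, the feature fires on roughly a thousand of about 3.1 million sampled tokens.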