Explanations

negative character traits and criticisms related to individuals or groups

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token        Logit
"RID"        -0.06
"ughter"     -0.06
"aska"       -0.06
"rd"         -0.06
"odore"      -0.06
"alker"      -0.06
"umen"       -0.06
" stakes"    -0.06
"inine"      -0.05
"RD"         -0.05

Positive Logits

Token          Logit
"abbo"         0.07
" wonder"      0.07
" anomaly"     0.07
"proxy"        0.07
"beh"          0.07
"elts"         0.07
" quasi"       0.07
"turnstile"    0.07
"gem"          0.06
" erst"        0.06

Activations Density 0.076%
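
To put the density figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation; the page itself does not define the term, so treat that reading as an assumption. The variable names are illustrative only and do not come from any tool.

```python
# Figures taken from the Configuration and Activations Density entries.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.076 / 100  # 0.076% expressed as a fraction

# Assumption: density = (tokens with nonzero activation) / (total tokens).
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = total_tokens * ACTIVATION_DENSITY

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,.0f}")
```

Under that assumption the feature would fire on roughly 2,400 of about 3.1 million tokens, which fits a sparse feature.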