INDEX

Explanations

phrases related to police interactions and accountability

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token         Logit
"aille"       -0.07
"Severity"    -0.07
" ìĨ"         -0.06
" bÃło"       -0.06
"spm"         -0.06
"spy"         -0.06
"ptom"        -0.06
" mktime"     -0.06
".borrow"     -0.06
"Ø±ÛĮÙħ"      -0.06

Positive Logits

Token           Logit
" comply"       0.12
" complied"     0.12
" compliance"   0.11
" complying"    0.11
" surrender"    0.10
" compliant"    0.10
" Compliance"   0.10
" handc"        0.10
" cooperate"    0.10
" cooper"       0.10

Activations Density 0.021%
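
As a rough sanity check on the figures above, the prompt count, prompt length, and activation density can be combined to estimate how many tokens the feature fires on. This is a minimal sketch that assumes the density is the fraction of all dashboard tokens with a nonzero activation; the page does not state that definition, and the function names are hypothetical.

```python
# Values copied from the dashboard above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.021 / 100  # 0.021% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density: float) -> int:
    """Approximate count of tokens on which the feature is active.

    Assumption: density = active tokens / total tokens.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, ACTIVATION_DENSITY)

print(f"total tokens: {total:,}")
print(f"approx. active tokens: {active:,}")
```

Under that assumption the feature is active on only a few hundred of roughly three million tokens, which fits a narrowly scoped explanation such as the one given above.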