Explanations

phrases that indicate criminal acts or safety concerns related to gang activity

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

"ÑĤÑİ"        -0.08
"pga"         -0.07
"ICH"         -0.07
" cach"       -0.07
"olding"      -0.07
"åĪĢ"         -0.07
".construct"  -0.07
"NSS"         -0.07
"ÏĥÎ¯"        -0.07
"HUD"         -0.07

Positive Logits

"igen"        0.07
[token missing in source]  0.06
"akes"        0.06
"dys"         0.06
" Timber"     0.06
"logs"        0.05
" slur"       0.05
"akening"     0.05
" alien"      0.05

Activations Density 0.009%
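
To put the density figure in context, the short sketch below estimates how many tokens the feature fires on. It assumes that "Activations Density" means the fraction of all dashboard tokens with a nonzero activation, and that the dashboard covers exactly the 24,576 prompts of 128 tokens listed under Configuration. Both are assumptions about how the dashboard computes the figure, and the variable names are illustrative only.

```python
# Assumption: density = (tokens with nonzero activation) / (all dashboard tokens).
NUM_PROMPTS = 24_576        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128     # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.009     # from "Activations Density"

# Total number of token positions the dashboard was computed over.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate count of token positions on which the feature is active.
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under these assumptions the feature is active on only a few hundred of roughly three million tokens. The reported density is rounded, so the count is an estimate, not an exact figure.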