Explanations

terms related to challenges or warnings associated with risky behaviors

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Wunused

-0.07

ableView

-0.07

assis

-0.06

gor

-0.06

 fontStyle

-0.06

çĴ°

-0.06

inyin

-0.06

deer

-0.06

 remar

-0.06

 iceberg

-0.06

Positive Logits

pus

0.07

ouver

0.07

ë²

0.06

airo

0.06

jest

0.06

Ð»Ð°ÑĪ

0.06

 Hatch

0.06

zept

0.06

 Rust

0.06

 Pillow

0.06
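
Some token labels in the logit lists above (for example `Ð»Ð°ÑĪ`, `çĴ°` and `ë²`) look like the display form used by GPT-2-style byte-level BPE tokenizers, where each raw byte is shown as a printable stand-in character. The page does not name the tokenizer, so this is an assumption. Under that assumption, the sketch below rebuilds the standard GPT-2 byte-to-character table and inverts it to recover the underlying text. The helper name `decode_token_label` is hypothetical.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild the GPT-2 byte -> printable-character table.

    Bytes that are already printable map to themselves; the remaining
    bytes are assigned code points starting at 256, in ascending order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: stand-in character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token_label(label: str) -> str:
    """Map a byte-level token label back to text (hypothetical helper).

    Characters outside the table (such as a plain leading space, which the
    dashboard appears to have rendered already) are passed through as UTF-8.
    Incomplete byte sequences decode to U+FFFD, since a single token may
    hold only part of a multi-byte character.
    """
    raw = bytearray()
    for ch in label:
        b = UNICODE_TO_BYTE.get(ch)
        if b is not None:
            raw.append(b)
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for label in ["Ð»Ð°ÑĪ", "çĴ°", "ë²", " Rust"]:
        print(repr(label), "->", repr(decode_token_label(label)))
```

A label that holds only a fragment of a multi-byte character cannot be recovered on its own. It only becomes readable once joined with the neighboring tokens that complete the sequence.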

Activations Density 0.000%