Explanations

phrases indicating impending trouble or consequences

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token               Logit
"iyan"              -0.07
"Slf"               -0.07
"irÃ¡"              -0.06
"Dra"               -0.06
" characteristic"   -0.06
"LAG"               -0.06
"stellung"          -0.06
"nett"              -0.06
" text"             -0.06
" Craft"            -0.06

Positive Logits

Token               Logit
"æĹıèĩªæ²»"         0.06
" Tears"            0.06
"Tec"               0.06
"ifo"               0.06
"cot"               0.06
"atk"               0.06
"imo"               0.06
" Verified"         0.06
"à¸ķà¸°"            0.06
"åŃ"                0.06

Activations Density 0.002%
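
To put the density figure in context, the sketch below estimates how many tokens the feature fires on. It assumes, hypothetically, that the reported density is the fraction of all dashboard tokens with a nonzero activation, and that it was computed over the 24,576 prompts of 128 tokens listed under Configuration. The page states neither of these, and the 0.002% figure is already rounded, so the result is only a rough order of magnitude. The function names are illustrative and not part of any dashboard API.

```python
# Figures copied from the dashboard page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # reported value, already rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of positions where the feature is active.

    ASSUMPTION: density = active positions / total positions.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"total token positions: {total}")
print(f"estimated active positions: about {round(active)}")
```

Under these assumptions the feature is active on only a few dozen of roughly three million token positions, which fits the very small logit magnitudes listed above.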