Explanations

the word "never" and its variations, indicating a focus on negation or absence

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token (quoted to show leading spaces)    Logit
"incy"                                   -0.07
"rank"                                   -0.07
"stadt"                                  -0.07
".scalablytyped"                         -0.07
"ule"                                    -0.06
"inz"                                    -0.06
"ozy"                                    -0.06
"odable"                                 -0.06
" batt"                                  -0.06
"recogn"                                 -0.06

Positive Logits

Token (quoted to show leading spaces)    Logit
"Never"                                  0.07
" never"                                 0.07
" Never"                                 0.07
"ebi"                                    0.06
" NEVER"                                 0.06
"never"                                  0.06
"jedn"                                   0.06
" hearing"                               0.06
"abilit"                                 0.06
" Damen"                                 0.06

Activations Density 0.014%
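
To put the density figure in perspective, the short sketch below converts it into an approximate token count. It assumes that the 0.014% density is the fraction of tokens in the dashboard prompt set (24,576 prompts of 128 tokens each) on which the feature has a nonzero activation; the dashboard does not state this, so treat the result as a rough estimate. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard; how the density is defined is an assumption.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.014 / 100  # 0.014% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(density: float, n_tokens: int) -> float:
    """Approximate count of tokens with a nonzero activation.

    Assumes density = (active tokens) / (all tokens in the prompt set).
    """
    return density * n_tokens


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(ACTIVATION_DENSITY, total)
print(f"{total:,} tokens, ~{active:.0f} with nonzero activation")
```

Under that assumption the feature fires on only a few hundred of roughly three million tokens, which fits an explanation centred on a single word and its case variants.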