Explanations

phrases indicating hypotheses or explanations for observed phenomena

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

ombok

-0.06

ά

-0.06

aos

-0.06

etu

-0.06

adel

-0.06

loy

-0.06

hod

-0.06

olumn

-0.06

riott

-0.06

jected

-0.06

Positive Logits

due

0.09

 caused

0.08

 result

0.08

 simply

0.07

due

0.07

 بسبب

0.07

.scalablytyped

0.07

 because

0.07

 CAUSED

0.07

uhl

0.07

Activations Density 0.036%
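
To give a rough sense of scale for the density figure, the sketch below converts it into an approximate token count. It assumes, without confirmation from the dashboard, that the density is measured over the same prompt set listed under Configuration (24,576 prompts of 128 tokens each). The function names are hypothetical and chosen only for illustration.

```python
# Figures copied from the dashboard; the link between them is an assumption.
NUM_PROMPTS = 24_576                # prompts in the dashboard set
TOKENS_PER_PROMPT = 128             # tokens per prompt
ACTIVATION_DENSITY_PERCENT = 0.036  # reported activations density, in percent


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of positions where the feature fires.

    Assumes density is the fraction of token positions with a nonzero
    activation, expressed as a percentage.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
print(f"{total:,} tokens in total, roughly {active:,} with a nonzero activation")
```

Under that assumption, the feature would fire on roughly one token in every 2,800, which is consistent with a sparse feature tied to a narrow set of causal phrases.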