Explanations

references to whales and their interactions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

 川

-0.07

 Rivers

-0.07

Mismatch

-0.07

_TM

-0.06

 lake

-0.06

 Ferry

-0.06

cz

-0.06

 FileStream

-0.06

川

-0.06

Stream

-0.06

Positive Logits

 whale

0.09

 Whale

0.09

 Herman

0.08

cet

0.08

 Levi

0.08

oby

0.08

 scrim

0.07

 Jonah

0.07

 whales

0.07

Activations Density 0.004%