Explanations

phrases suggesting the importance of not solely relying on the speaker's claims

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

-0.06

no

-0.06

fol

-0.06

 front

-0.05

 Stam

-0.05

erer

-0.05

bard

-0.05

 false

-0.05

ú

-0.05

rong

-0.05

Positive Logits

alone

0.09

 trust

0.09

 alone

0.09

Trust

0.08

مانی

0.08

 банку

0.08

 Alone

0.07

trust

0.07

 Trust

0.07

пов

0.07

Activations Density 0.004%
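
To put the density figure in concrete terms, the short calculation below estimates how many tokens the feature fires on. It is a rough sketch that assumes the 0.004% density was measured over the dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each); the page does not state this, and the variable names are illustrative only.

```python
# Rough estimate of how many tokens activate this feature.
# Assumption: the density is measured over the dashboard prompt set.
PROMPTS = 24_576           # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128    # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.004    # from "Activations Density"

total_tokens = PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens:      {total_tokens:,}")
print(f"estimated active:  {round(expected_active)}")
```

Under that assumption the feature is active on roughly 126 of about 3.1 million tokens, which is consistent with a narrow feature tied to words such as "trust" and "alone".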