Explanations

the presence of decision-making and the consequences related to choices

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token             Logit
" compared"       -0.06
"沢"              -0.06
" accordingly"    -0.06
" Dude"           -0.06
"ilot"            -0.06
"rada"            -0.06
"IQ"              -0.06
"amp"             -0.06
"alth"            -0.06
"acked"           -0.06

Positive Logits

Token             Logit
" otherwise"      0.13
" Otherwise"      0.13
"Otherwise"       0.12
"otherwise"       0.12
"否"              0.11
" OTHERWISE"      0.10
" Nope"           0.09
" naopak"         0.09
" else"           0.08
" opposite"       0.08

Activations Density 0.054%
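
To put the density figure in context, the sketch below combines it with the prompt configuration listed above (24,576 prompts of 128 tokens each). It assumes that the density is the fraction of token positions in that prompt set on which the feature has a nonzero activation; the page does not state this, so treat the second number as a rough estimate. The function and constant names are illustrative only.

```python
# Values copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.054


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of token positions where the feature fires.

    Assumption: density is the fraction of token positions with a
    nonzero activation, measured over the same prompt set.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"approx. active tokens: {active:,.0f}")
```

Under that assumption, the feature fires on roughly 1,700 of about 3.1 million token positions, which is consistent with a sparse feature.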