Explanations

concepts related to control and authority dynamics

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token                 Logit
"лив"                 -0.07
"addir"               -0.07
"lič"                 -0.07
"acus"                -0.07
"forman"              -0.07
"PointerException"    -0.07
"ンティ"               -0.07
" Klopp"              -0.07
"уса"                 -0.07

Positive Logits

Token                 Logit
" control"            0.18
"control"             0.15
" Control"            0.14
"-control"            0.13
"Control"             0.13
".control"            0.12
"/control"            0.12
" CONTROL"            0.12
" influence"          0.12
" controls"           0.12

Activations Density 0.050%
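
As a rough sanity check, the density figure can be related to the prompt configuration above. The sketch below assumes that activation density means the fraction of token positions on which the feature fires, and that it was measured over the same 24,576 prompts of 128 tokens each; the page states neither, so the resulting count is only an estimate. The variable names are illustrative.

```python
# Figures taken from the dashboard configuration above.
N_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.050  # "Activations Density 0.050%"

# Total token positions in the dashboard prompt set.
total_tokens = N_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (positions where the feature is active) / (all positions).
estimated_active = total_tokens * DENSITY_PERCENT / 100

print(f"total token positions: {total_tokens:,}")
print(f"estimated active positions: {round(estimated_active):,}")
```

Under these assumptions the feature would be active on roughly 1,570 of about 3.1 million token positions, which is consistent with a sparse feature tied to a narrow family of tokens such as "control".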