Explanations

concepts related to cyber-security and protective measures against attacks

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token            Logit
" Uniform"       -0.06
"Uniform"        -0.06
" showers"       -0.06
"twe"            -0.06
" killer"        -0.06
"ases"           -0.06
" Union"         -0.06
"ase"            -0.06
" uniform"       -0.06

Positive Logits

Token            Logit
" plant"         0.09
" control"       0.09
"ollower"        0.08
" safety"        0.08
"plant"          0.08
" Controllers"   0.08
" Control"       0.08
" controller"    0.08
"/control"       0.08
" plants"        0.08

Activations Density: 0.105%