Explanations

references to accountability and the consequences of actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

heel

-0.08

endra

-0.08

strup

-0.07

UBY

-0.07

كيب

-0.07

makt

-0.07

ồng

-0.07

TabPage

-0.06

ovan

-0.06

Soap

-0.06

Positive Logits

 necess

0.14

 necessity

0.12

 forced

0.10

 nécess

0.10

必须

0.10

 need

0.10

 buộc

0.10

 forcing

0.09

 require

0.09

need

0.09

Activations Density 0.033%
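
To put the density figure in context, the short sketch below estimates how many tokens the feature fires on. It assumes, without confirmation from the page, that the density is the fraction of tokens with a nonzero activation and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration. The function names are illustrative only.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.033 / 100  # 0.033% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(n_tokens: int, density: float) -> int:
    """Approximate count of tokens on which the feature is active.

    Assumption: density is the fraction of tokens with a nonzero
    activation, measured over these same prompts.
    """
    return round(n_tokens * density)


n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
print(f"total tokens: {n:,}")
print(f"approx. active tokens: {expected_active_tokens(n, ACTIVATION_DENSITY):,}")
```

Under these assumptions the feature is active on roughly one token in three thousand, so the top logits above are drawn from a small set of activating positions.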