INDEX

Explanations

controlled experiments and variables

Negative Logits

Token          Logit
" Vladimir"    -0.08
" pension"     -0.08
".extend"      -0.08
"PRI"          -0.07
"Princess"     -0.07
"alwa"         -0.07
"Fuse"         -0.07
"pyg"          -0.07
"积"           -0.07
"czę"          -0.07

Positive Logits

Token            Logit
" controlled"    0.13
"Controlled"     0.13
" Controlled"    0.12
"controlled"     0.12
" gecontrole"    0.11
" rigor"         0.11
"-controlled"    0.11
" Treatments"    0.10
" controlar"     0.10
" эксперимент"   0.10

Activations Density: 0.027%