Explanations

escape from or escape code

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token             Logit
" tempi"          -0.94
" guer"           -0.84
" suivant"        -0.84
" sexe"           -0.82
"IGHTS"           -0.81
" marginBottom"   -0.81
"agi"             -0.81
" ribu"           -0.80
"uaian"           -0.80
"我们要"           -0.79

Positive Logits

Token             Logit
" Escape"          1.40
"goat"             1.28
" route"           1.28
" hatch"           1.14
" escape"          1.13
"Es"               1.13
" escap"           1.11
"Escape"           1.09
" escapes"         1.08
" mechanism"       1.06

Activations Density 0.017%
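
To put the density figure in context, the short sketch below converts it into an approximate token count using the prompt configuration listed above. It assumes that "Activations Density" means the fraction of all dashboard tokens on which the feature has a nonzero activation; the page does not state this definition, so treat the result as a rough estimate. The function names are illustrative only.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.017


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens the dashboard was computed over."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens where the feature fires.

    Assumption (not stated on the page): density is the share of
    tokens with a nonzero activation, expressed as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total}")
print(f"estimated active tokens: {round(active)}")
```

Under that assumption, the feature fires on roughly 535 of about 3.1 million tokens, which fits a narrow feature tied to "escape"-related text.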