Explanations

destruction

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token            Logit
' destroyed'     -1.66
' destruction'   -1.55
' destroys'      -1.49
' destroying'    -1.35
' destru'        -1.34
' destroy'       -1.30
'destroyed'      -1.27
' Destruction'   -1.27
'destruction'    -1.22
' destr'         -1.20

Positive Logits

Token            Logit
'ed'             0.59
'able'           0.59
(blank)          0.55
(blank)          0.54
'ring'           0.54
'์'              0.51
' насељу'        0.50
')"),'           0.48
'어'             0.47

Activations Density 1.571%
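
As a rough worked example, the dashboard figures above can be combined to estimate how many token positions this feature fires on. The sketch assumes that the activation density is the fraction of token positions with a nonzero activation, and that it was measured over the same 16,384 prompts of 128 tokens each. Neither assumption is stated on the page, and the variable names are illustrative only.

```python
# Figures copied from the dashboard configuration.
num_prompts = 16_384
tokens_per_prompt = 128
activation_density = 0.01571  # 1.571%

# Total token positions in the dashboard prompt set.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density = (positions with nonzero activation) / (total positions).
estimated_active_tokens = total_tokens * activation_density

print(f"total token positions: {total_tokens:,}")
print(f"estimated active positions: {estimated_active_tokens:,.0f}")
```

Under these assumptions the feature would be active on roughly 33 thousand of about 2.1 million token positions.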